IRE TRANSACTIONS ON INFORMATION THEORY


Sputnik, Et Cetera 


W. B. DAVENPORT, JR. 


It is becoming a tradition of the IRE Professional 
Group on Information Theory to have its Chairman 
contribute an editorial to the December issue of 
these TRANSACTIONS. In keeping with this tradition, 
I would like to add a few words of my own to the 
daily growing clamor occasioned by the satellite 
Sputnik. 

Ever since the first Russian satellite was launched 
into its orbit on October 4, 1957, a torrent of words 
has been inundating the daily press bearing charges 
and countercharges as to why the U.S. lags the 
U.S.S.R. in the development of missiles and sat- 
ellites. The main topics of this discussion seem to be: 
1) the importance of basic research, 2) the impact 
of the present and past Federal fiscal policies on 
military research and development programs, 3) 
the effect of interservice rivalries on ballistic missile 
development and procurement, and 4) the relative 
merits of the U.S. and the U.S.S.R. programs for
the education of scientists and engineers. While all 
of these topics are important, I shall confine my 
remarks to the first—at the risk of ignoring the old 
Italian proverb, “A missionary makes few converts
amongst true believers.”

A cursory review of the statements of the past 
several months would give one the feeling that, of 
course, we are all in favor of basic research. Certainly 
the development of atomic and nuclear weaponry 
resulting from the early studies of the atomic 
nucleus, the development of the transistor resulting 
from basic studies of the solid state, and the practical 
developments resulting from the theories of Shannon 




and Wiener are all well-known stories and should 
provide incontrovertible evidence for the necessity 
of continually supporting free and untrammeled 
basic research. Unfortunately, the facts are other- 
wise. Basic research, or the gathering of new funda- 
mental knowledge, is all too often treated as some- 
thing to be dispensed with in times of crisis, not only 
by responsible public officials but also by many 
scientists and engineers themselves. I feel that this 
is a fundamental error. Sooner or later we are going 
to have to face the fact that crises will always be 
with us. 

Regardless of what we do now to meet the crisis 
of the present, we must also maintain an active 
program of basic research in all the sciences now or 
we will be unable to meet the crises of the future. 
This must be kept in mind throughout our present 
necessary, and costly, rush to push ballistic missiles 
into production. 

Therefore, may I add one small voice to the 
plea for reason. Basic research must never be 
abandoned in the interest of expediency, and in 
fact should be emphasized—in direct proportion 
to the increased effort applied to the opportune 
developments. This holds as much in our own 
field of Information Theory as in the other dis- 
ciplines. Hence, it is my earnest hope that we 
shall effect an appropriate balance between basic 
advances and practical applications, as previous 
contributors to this space have entreated, and that 
we will not be unduly influenced by the force of 
temporal circumstance, expedience, or Sputniks. 


A Theory of Multilevel Information Channel with Gaussian Noise*


SATOSI WATANABE†


Summary: The interval between the “0” level and “1” level of
a binary channel with Gaussian noise is subdivided to provide
$n = 2^g$ levels per symbol. The channel capacity is computed as a
function of $g$ and of the signal-noise ratio $D$ of the original binary
channel. For a sufficiently large, fixed value of $D$, if we increase $g$
indefinitely, the channel capacity approaches the logarithm of $D$,
as can be expected from a continuous channel. For a fixed value of
$g$, if we increase $D$ indefinitely, the channel capacity approaches $g$.
For a given value of $D$, there is a certain value of $g$ beyond which
the channel capacity no longer increases appreciably with
increasing $g$. The problem is first solved by a simplifying model,
and then the error introduced by this simplification is estimated.


“RING” MODEL


THE original problem sketched in the Summary
pertains, in brief, to the question of how far we
can expect to improve the capacity of
a given binary channel by splitting the level interval into
more than one subinterval, when the binary channel has
a (symmetric) probability of error $p$; i.e., when its
capacity is given by

$$C = 1 + p \log p + (1 - p) \log (1 - p), \tag{1}$$


where log stands for logarithm to the base 2. Thus, the 
problem may be characterized as that of an amplitude- 
modulated information channel. In order to obtain a 
qualitative insight into the general nature of such a 
channel, a simple Gaussian type of noise will be assumed. 
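As a minimal sketch of (1), the binary-channel capacity can be evaluated directly with base-2 logarithms. The function name `bsc_capacity` is an illustrative choice, not the paper's notation, and the convention $0 \log 0 = 0$ at the end points is an assumption of the sketch.

```python
import math


def bsc_capacity(p: float) -> float:
    """Capacity in bits per symbol of a binary channel with symmetric
    error probability p, following Eq. (1): C = 1 + p log p + (1-p) log(1-p).
    The name bsc_capacity is hypothetical (not from the paper)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p in (0.0, 1.0):
        # Assumed convention: 0 * log 0 = 0, so the channel is noiseless.
        return 1.0
    return 1.0 + p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p)


if __name__ == "__main__":
    for p in (0.0, 0.01, 0.1, 0.5):
        print(f"p = {p:4.2f}  ->  C = {bsc_capacity(p):.4f} bits")
```

The capacity falls from one bit at $p = 0$ to zero at $p = 1/2$, which is the behavior the multilevel analysis below starts from.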

Before coping with this problem, we shall introduce 
another one, which by itself is a meaningful one based on 
a set of well-defined assumptions, and whose solution is 
relatively easy, although it may not directly correspond 
to a realistic technological problem. We shall see in a 
later section how we can derive conclusions pertinent to 
the original problem from this “model,” and we shall 
also evaluate the error committed by substituting this 
model for the original problem. 

The problem we attack first is as follows: each symbol
of the code is represented by one of the $n$ available points
separated by arcs of equal length $l = L/n$ on a circle of
circumference $L$. We consider only those $n$ that satisfy
$n = 2^g$ with $g$ a positive integer. The basic assumption we
make is that the error distribution of the physical quantity
implementing this channel, whose domain of variation
here is the circle, is of the Gaussian type with standard
deviation $\sigma$. Hereafter, we shall measure any arc of the
circle in units of $\sigma$. Thus, the total circumference will
be given by $G = L/\sigma$, and the interval by $d = l/\sigma =
L/(n\sigma) = G/n$.


* Manuscript received by the PGIT, February 13, 1957.
† IBM Research Lab., Poughkeepsie, N. Y.


If any one of the $n$ possibilities is transmitted; i.e., if
any one (call this one $P_0$) of the $n$ points on the circle is
sent, the corresponding point at the receiving end will be
distributed with the probability distribution

$$f(x) = (1/\sqrt{2\pi}) \exp(-x^2/2) \tag{2}$$

on the circle about the point $P_0$. It will be natural to
assume that if the received point $x$ falls on the interval
$rd - (1/2)d < x < rd + (1/2)d$ with $r$ an integer, the
receiver will recognize the signal as a point $P$ which is $rd$
apart from the original point $P_0$; i.e., the $r$th neighbor of
$P_0$. We shall identify the relative position of $P$ by the
integer $r$, which is limited by $-(1/2)n < r \le (1/2)n$.

Then the probability of the transmitted signal $P_0$
being registered by the receiver as its $r$th neighbor will
be given by

$$p(r) = \sum_{k=-\infty}^{+\infty} \int_{rd-(d/2)+kG}^{rd+(d/2)+kG} f(x)\,dx, \tag{3}$$

where the integrand is given by (2). For instance, the
term $k = 1$ in (3) represents the contribution from the
error going, in one direction, one entire circumference
plus $rd$. We have obviously

$$p(r) = p(-r); \qquad \sum_r p(r) = 1, \quad -(n/2) < r \le (n/2). \tag{4}$$
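The wrapped sum in (3) can be sketched numerically as follows. This is an illustration under stated assumptions, not the author's computation: the function names are hypothetical, the infinite sum over $k$ is truncated once the integration limits lie more than about ten standard deviations out, and the index range is taken as $-(n/2) < r \le (n/2)$.

```python
import math


def gauss_cdf(x: float) -> float:
    """Cumulative distribution of the unit Gaussian f(x) of Eq. (2)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def ring_transition_probs(G: float, g: int) -> dict[int, float]:
    """p(r) of Eq. (3) for a ring of circumference G (in units of sigma)
    carrying n = 2**g equally spaced points.
    The truncation of the k-sum is an assumption of this sketch."""
    n = 2 ** g
    d = G / n
    k_max = int(math.ceil(10.0 / G)) + 1  # cover at least +/- 10 sigma
    probs = {}
    for r in range(-n // 2 + 1, n // 2 + 1):
        total = 0.0
        for k in range(-k_max, k_max + 1):
            lower = r * d - d / 2 + k * G
            upper = r * d + d / 2 + k * G
            total += gauss_cdf(upper) - gauss_cdf(lower)
        probs[r] = total
    return probs


if __name__ == "__main__":
    p = ring_transition_probs(G=10.0, g=3)
    for r in sorted(p):
        print(f"r = {r:2d}   p(r) = {p[r]:.6f}")
    print("sum =", sum(p.values()))
```

The printed values can be compared against the two properties stated in (4): symmetry in $r$ and normalization to unity.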


If we denote by $p_i(j)$ the probability of the transmitted
signal $i$ being received as signal $j$, and by $w_i$ the weight
with which signal $i$ is used by the transmitter, then the
theoretical channel capacity is given by¹

$$C = \sum_i \sum_j w_i\,p_i(j) \Bigl[\log p_i(j) - \log \sum_k w_k\,p_k(j)\Bigr]. \tag{5}$$

In the following, we shall assume that each signal is used
by the transmitter with equal weight; i.e., $w_i = 1/n =
1/2^g$. Then (5) becomes

$$C = \log n + (1/n) \sum_i \sum_j p_i(j) \log p_i(j) - (1/n) \sum_j P(j) \log P(j), \tag{6}$$

with

$$P(j) = \sum_i p_i(j). \tag{7}$$

In our case, we have obviously $P(j) = 1$, hence the
channel capacity is given by


$$C = g + \sum_r p(r) \log p(r), \qquad -(n/2) < r \le (n/2). \tag{8}$$

¹ This is the channel capacity when the $w$'s are fixed. The maximum
of this quantity obtained by varying the $w$'s is what is usually
called channel capacity. See, for example, C. E. Shannon and W.
Weaver, “The Mathematical Theory of Communication,” University
of Illinois Press, Urbana, Ill., p. 38; 1949.

It is easy to see that for extremely high noise and
extremely low noise, (8) with (3) gives

$$\lim_{\sigma \to \infty} C = g - \log n = 0; \quad \text{and} \quad \lim_{\sigma \to 0} C = g. \tag{9}$$
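The two limits in (9) can be checked with a short numerical sketch of (8). The helper name `ring_capacity`, the truncation of the sum over $k$ in (3), and the particular values of $G$ used to stand in for "extremely high" and "extremely low" noise are all assumptions made for illustration.

```python
import math


def ring_capacity(G: float, g: int) -> float:
    """C = g + sum_r p(r) log2 p(r), Eq. (8), with p(r) taken from Eq. (3).
    G is the circumference in units of sigma; n = 2**g points on the ring.
    The k-sum is truncated at roughly +/- 10 sigma (sketch assumption)."""
    n = 2 ** g
    d = G / n
    k_max = int(math.ceil(10.0 / G)) + 1

    def cdf(x: float) -> float:
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    capacity = float(g)
    for r in range(-n // 2 + 1, n // 2 + 1):
        p = sum(
            cdf(r * d + d / 2 + k * G) - cdf(r * d - d / 2 + k * G)
            for k in range(-k_max, k_max + 1)
        )
        if p > 0.0:  # terms with p = 0 contribute nothing
            capacity += p * math.log2(p)
    return capacity


if __name__ == "__main__":
    g = 3
    for G in (0.01, 1.0, 10.0, 30.0, 200.0):
        print(f"G = {G:7.2f}   C = {ring_capacity(G, g):.4f}")
```

Small $G$ (large $\sigma$) drives the result toward zero, and large $G$ (small $\sigma$) drives it toward $g$, in line with (9).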

Furthermore, we can expect, and we shall indeed
demonstrate later, that for a given value of $G$ and in the
limit $g \to \infty$, we get a channel capacity per symbol equal
to the logarithm of the signal-noise ratio $G$, since the
nature of the problem approaches a continuous channel
and the order of magnitude of the maximum frequency
involved here is given by²

$$\nu_{\max} = 1/(\text{duration of a symbol}). \tag{10}$$


APPROXIMATION FOR LARGE G (LOW NOISE)


Without passing to the limiting case $\sigma \to 0$,
$G \to \infty$, let us examine the cases where $G$ is so large that
the probability of an error larger than the order of the
circumference is negligible. Usual applications of the
theory actually lie in this domain.

Under this assumption, all the errors propagating more
than one circumference in one direction can be neglected.
There are then only two possible paths to reach point $r$
from $P_0$, namely one propagating clockwise and the other
propagating counterclockwise from $P_0$ to $r$, neither being
longer than the circumference $G$. Furthermore, unless $r$ is
in the neighborhood of $r = n/2$ (which is the point
diametrically opposite to $P_0$), one of the paths is
appreciably longer than the other. Due to the nature of the
Gaussian distribution, the contribution to $p(r)$ from the
shorter path will then be considerably larger than the
contribution from the longer path. Thus we need consider
only the shorter path for those points. In addition to this,
if $G$ is large enough, it is easy to see that $p(r)$ for $r$ in the
vicinity of $r = n/2$ is so small that its contribution to $C$
in (8) is negligible compared with the contributions from
the $r$'s which are closer to the original point $P_0$. Writing
$r = r_0$ $(n/2 > r_0 > 0)$ for the point which marks the
limit of the “vicinity” of the point $r = n/2$, one may
thus write

$$C = g + p(0) \log p(0) + 2 \sum_{r=1}^{r_0} p(r) \log p(r), \tag{11}$$

with

$$p(r) = p(-r) = \int_{rd-(d/2)}^{rd+(d/2)}, \tag{12}$$


where the omitted integrand is given by (2). Since the
contribution $p(r) \log p(r)$ from $r$ beyond $r = r_0$, whether
or not $(n/2) > r$, is very small, one can write as well

$$C = g + \sum_{r=-\infty}^{\infty} p(r) \log p(r); \qquad \Bigl(\sum_{r=-\infty}^{\infty} p(r) = 1\Bigr), \tag{13}$$

with $p(r)$ given in (12). A careful estimation of errors
shows that (13) is perfectly reliable for $G > 10$.

² This matter will be discussed more fully in a separate paper.


APPROXIMATION FOR LARGE g (LARGE MULTIPLICITY),
AND QUALITATIVE DESCRIPTION OF THE ACTUAL 
SITUATION 


We shall introduce first an approximation which is
good for values of $d$ $(= G/n = G/2^g)$ which are smaller
than unity, with the restriction, however, that $G > G_0$,
where $G_0$ is the value of $G$ beyond which (13) becomes
reliable; i.e., $G_0$ is somewhere in the neighborhood of 10,
say. For a given noise level, i.e., for a given value of $G$, the
approximation, therefore, will be good for large $g$. But for
a given value of $g$, the approximation will become good
for small $G$, within the limit $G > G_0$.

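If $d \ll 1$, we can write (12) as

$$p(r) = (d/\sqrt{2\pi}) \exp[-(rd)^2/2]. \tag{14}$$

The summation in (13) can then be evaluated as follows:

$$\begin{aligned}
\sum_r p(r) \log p(r) &= \log (d/\sqrt{2\pi}) \int_{-\infty}^{\infty} f(x)\,dx - (1/2) \log e \int_{-\infty}^{\infty} x^2 f(x)\,dx \\
&= \log d - (1/2) \log (2\pi e) \\
&= \log G - g - (1/2) \log (2\pi e).
\end{aligned} \tag{15}$$

Putting (15) in (13) we obtain

$$C_\infty = \log G - (1/2) \log (2\pi e). \tag{16}$$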
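To see how close the large-$g$ expression (16) comes to the sum in (13), the following sketch evaluates both for one choice of $G$ and $g$. The function names are illustrative, the infinite sum in (13) is truncated at about ten standard deviations, and the values $G = 10$, $g = 3$ are picked only because the text quotes them as an example.

```python
import math


def gauss_cdf(x: float) -> float:
    """Cumulative distribution of the unit Gaussian of Eq. (2)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def c_infinity(G: float) -> float:
    """Large-g limit of Eq. (16): log2 G - (1/2) log2(2 pi e)."""
    return math.log2(G) - 0.5 * math.log2(2.0 * math.pi * math.e)


def c_eq13(G: float, g: int) -> float:
    """Eq. (13) with p(r) from Eq. (12); the r-sum is truncated
    at roughly +/- 10 sigma, which is an assumption of this sketch."""
    d = G / 2 ** g
    r_max = int(math.ceil(10.0 / d)) + 1
    total = float(g)
    for r in range(-r_max, r_max + 1):
        p = gauss_cdf(r * d + d / 2) - gauss_cdf(r * d - d / 2)
        if p > 0.0:
            total += p * math.log2(p)
    return total


if __name__ == "__main__":
    G, g = 10.0, 3
    delta = c_eq13(G, g) - c_infinity(G)
    print(f"C_inf = {c_infinity(G):.4f}")
    print(f"C_g   = {c_eq13(G, g):.4f}")
    print(f"difference = {delta:.4f}")
```

The difference printed on the last line is the quantity that the next paragraph treats as a correction to $C_\infty$.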


Then the correct expression of the channel capacity, (8),
or where applicable, (13), can be written as

$$C_g = C_\infty + \Delta, \tag{17}$$

with

$$\Delta = \sum p \log p - [\log d - (1/2) \log (2\pi e)]. \tag{18}$$

For a given value of $G$, the larger the value of $g$, the
smaller the correction represented by $\Delta$.
The condition

$$d \ll 1, \tag{19}$$

which is the premise for the derivation of (15), means³

$$G \ll 2^g \quad \text{or} \quad \log G \sqsubset g. \tag{20}$$

Actual calculation shows that $\Delta$ is negative, and $|\Delta|$ is
less than 3 per cent of the absolute value of $\sum p \log
p = C_g - g$ at $d = 1$. Thus, already for $g = \log G$, the
channel capacity $C_g$ of (17) can very well be approximated
by $C_\infty$ of (16). For instance, for $G = 10$, $C_g$ can tolerably
be replaced by $C_\infty$ for $g$ equal to or larger than 3.

For very large values of $G$, the constant term in (16)
can be neglected, and $C_\infty$ becomes, as anticipated, the
logarithm of the snr: $G = L/\sigma$.


³ The symbol $\sqsubset$ is used in the sense that $x \sqsubset y$ means
$2^{y-x} \gg 1$ $(x > 0, y > 0)$.

Fig. 1: Channel capacity $C$ of the “ring” model vs the signal-
noise ratio $G$ for various $g$.
noise ratio G for various g. 


For a qualitative understanding of the situation, we
should note the following facts. Considering $C_g$ as a
function of $G$ for a given $g$, we can use $C_\infty$ in lieu of $C_g$ as long
as $G_0 < G < 2^g$. As $G$ increases beyond $2^g$, $C_g$ will gradually
branch off the curve of $C_\infty$ and level off toward the limiting
value, which is $g$. [See (9)]. For $G$ smaller than $G_0$, $C_g$ will
approach zero as $G$ decreases. [See (9)]. Fig. 1 shows exact
curves of $C_g$ for various $g$ (including $g = \infty$). They are
obtained using (3) and (8), or, where permissible, the
corresponding approximate formulas previously introduced.

In a still coarser approximation, we can describe the
situation as follows: the exact curve of $C_g$ for a given $g$
lies very close to the curve $C_\infty$ so long as $C_\infty$ is less than $g$.
Then the exact curve rather abruptly levels off at the
limiting value $g$. A type of question which arises in practice
pertains to how far we can improve the channel capacity
by increasing $g$ when the noise level $G$ is given. A zeroth
approximation answer to this question is as follows. For
the given value of $G$, we calculate $C_\infty$, and determine $g_0$ by

$$g_0 - 1 < C_\infty \le g_0. \tag{21}$$

Then this $g_0$ will give the smallest value of $g$ which gives
a channel capacity very close to the maximum capacity,
$C_\infty$. Any larger value of $g$ will not appreciably improve the
channel capacity. For instance, for $G$ in the domain $17
< G < 33$, $C_\infty$ lies between 2 and 3. Therefore, $g$ larger
than 3 will not improve the capacity to an appreciable
extent in this domain.
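The zeroth-approximation rule can be sketched in a few lines, reading (21) as "$g_0$ is $C_\infty$ rounded up to the next integer." That reading, the function names, and the lower clamp at $g_0 = 1$ for very noisy channels are assumptions of this sketch rather than statements from the paper.

```python
import math


def c_infinity(G: float) -> float:
    """Eq. (16): log2 G - (1/2) log2(2 pi e)."""
    return math.log2(G) - 0.5 * math.log2(2.0 * math.pi * math.e)


def zeroth_order_g0(G: float) -> int:
    """Smallest g beyond which capacity is not expected to improve much,
    taking g0 - 1 < C_inf <= g0 (assumed reading of Eq. (21)).
    The clamp to at least one binary digit is an added assumption."""
    return max(1, math.ceil(c_infinity(G)))


if __name__ == "__main__":
    for G in (17.5, 25.0, 32.5, 40.0):
        print(f"G = {G:5.1f}   C_inf = {c_infinity(G):.3f}   g0 = {zeroth_order_g0(G)}")
```

For the three values of $G$ inside the domain $17 < G < 33$ the sketch returns $g_0 = 3$, matching the example in the text.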

For a little more precise discussion, we can search the
smallest $g$ which satisfies

$$C_g \ge 0.9\,C_\infty \tag{22}$$

for the given value of $G$. Then any value larger than this
$g$ will give only an improvement within 10 per cent of $C_\infty$.


STRAIGHT MULTILEVEL CHANNEL AND RING MODEL


The general idea underlying the proposed comparison
of the ring model mentioned previously to the original
problem of a straight multilevel channel is that we
cut the ring at a point, stretch it straight, and compare
the two end points on this strip to the “0” level and
the “1” level of the original binary channel. Thus,
we divide the circle $G$ in $n = 2^g$ $(g \ge 1)$ equal intervals
and cut it at one point and denote the distance between
the two end points by $D$. The formulas obtained in the
preceding sections were written in terms of $G$. It is more
convenient hereafter to write them in terms of $D$, which is
obviously $G - d$. Therefore we should substitute for $G$

$$G = D\,2^g/(2^g - 1). \tag{23}$$

The ratio of $D$ to $G$ of course tends to unity as we
increase $g$.

The general expression (17) of channel capacity in the
ring model is now rewritten as

$$C_g = \log D - (1/2) \log (2\pi e) + \Delta + B, \tag{24}$$

where $\Delta$ $(< 0)$ is given by (18), and $B$ is given by

$$B = g - \log (2^g - 1). \tag{25}$$

For very large $g$, we have

$$C_\infty = \log D - (1/2) \log (2\pi e). \tag{26}$$

This $C_\infty$ and $C_\infty$ of (16) become identical when $g$ is really
infinitely large. Otherwise, the difference is given by $B$.

Now, let us go back to the original problem. Suppose
we are given a binary channel, in which a certain value
$q_0$ and value 0 of a physical quantity $q$, e.g., voltage, power,
etc., are supposed to correspond respectively to the “1”
and the “0” of the binary system. It is assumed that this
physical quantity suffers during transmission fluctuation
of the Gaussian type with a constant standard deviation
$\sigma$ (in the same unit as $q$) about the originally transmitted
signal strength of $q$. If the physical quantity $q$ itself does
not show a Gaussian fluctuation, we should adopt a
suitable function of $q$, for which the fluctuation is
approximately Gaussian. Let us measure the “1” level;
i.e., the value $q_0$ of $q$, in units of $\sigma$. Then what was called $D$
previously corresponds to

$$D = q_0/\sigma. \tag{27}$$

However, there is obviously a difference between the
ring model and the original straight multilevel problem,
namely, in the former model errors beyond one end
“overflow” to the other end. In the original problem, it
will be natural to assume that any received signal beyond
an end-point will be registered as belonging to this
end-point. This discrepancy will become very quickly negligible
as $D$ increases. In fact, the relative number of levels
participating in the “overflow” becomes smaller as $D$
increases. For a given value of $D$, if we increase $g$, there
will be more and more levels participating in the
“overflow,” but at the same time the number of levels which
do not participate in the “overflow” will also increase
proportionally. Therefore, the relative error will not
appreciably depend on $g$ (beyond a certain value of $g$)
when $D$ is once given.

Let us compare first the straight binary channel and
the corresponding circular binary channel. In the former,
any received value beyond $(1/2)D$ will be registered as
“1”, and any value below $(1/2)D$ will be registered as
“0”. Thus, the probability of error is

$$p(1) = \int_{D/2}^{\infty} f(x)\,dx = a. \tag{28}$$

Now in the ring model (with $g = 1$) with circumference
$G$ with two points $D = (1/2)G$ apart, we have (for $D > 1$)

$$p(1) = \sum_{k=-\infty}^{+\infty} \int_{kG+D/2}^{kG+3D/2} f(x)\,dx \approx 2 \int_{D/2}^{\infty} f(x)\,dx = 2a. \tag{29}$$

Thus, the capacities calculated with (28) and with
(29), with the same $D$, are different. Therefore, when a
straight binary channel with a certain probability of
error $p$ is given, it may be reasonable in this ring
approximation to determine $D$ by equating this $p$ to $2a$ instead
of $a$. However, on account of the nature of the error
function, the value of $D$ obtained by setting $p = a$ and
the one obtained by $p = 2a$ do not differ appreciably.
For instance, when the latter choice gives $D = 10$, the
former choice gives $D = 9.6$. A difference in $D$ of this
order of magnitude does not result in an appreciable
difference in its channel capacity. For larger values of $D$,
the difference is entirely negligible.

Fig. 2 gives $2a$ as a function of $D$, which will help to
evaluate the appropriate value of $D$ to be used in the
multilevel problem. Fig. 3 shows $C_g$ as given by (24) and
$C_\infty$ as given by (26), both as functions of $D$.⁴


ERRORS DUE TO “OVERFLOW”


The error committed by substitution of the ring model
for the original “straight” channel is due to the overflows.
One way to subdue this error is, as suggested in the
preceding section, to use a modified value of $D$ for the ring
model; i.e., to determine $D$ by setting $p = p(1)$ in (29)
instead of in (28). A more consistent way to attack this
problem is to use the same $D$ for the ring model and the
original problem and to evaluate the difference in the
resulting channel capacity.

For smaller values of $g$, direct calculation of this
difference can easily be done. For instance, in the binary
channel ($g = 1$), we can obtain the capacity $C'$ of the
straight channel by putting (28) in (1) and the capacity
$C$ of the ring model by putting (29) in (1).


⁴ When $G$ is used as abscissa as in Fig. 1, a curve with larger $g$
does not come below a curve with smaller $g$. This situation is no
longer a general rule when $D$ is used as abscissa as in Fig. 3. It may,
however, be expected that this situation will be reestablished even
with $D$ as abscissa if the overflowing errors are stopped and the
$w$'s in (5) are suitably chosen.

Fig. 2: The logarithm (base 10) of the binary channel error $p$
against the signal-noise ratio $D$, where $p$ and $D$ are connected by

$$p = 2 \int_{D/2}^{\infty} f(x)\,dx$$

with

$$f(x) = (1/\sqrt{2\pi}) \exp(-x^2/2).$$

Fig. 3: Channel capacity $C$ for various $g$ of the “straight”
multilevel channel as estimated by the “ring” model, plotted against
the snr $D$.


The relative error

$$\epsilon = (C' - C)/C \tag{30}$$

turns out to be of the order of $-a \log a$ (if $a$ is sufficiently
small). For $D = 10$, we have $a = 3 \times 10^{-7}$ and $\epsilon = 10^{-5}$.
In the case of binary channels, for small values of $a$, the
capacity $C$ as well as $C'$ is very close to that of a noiseless
channel, i.e., unity.

For $g$ larger than unity, in the straight channel, $P(j)$ of
(7) is not necessarily unity, namely, the end-points have
larger values of $P(j)$ than the other points, since erroneous
signals tend to accumulate on the end-points. Thus, we
have to use (6) instead of (8). We assume $w_i$ in (5) to be
the same for all $i$.

For instance, in the case of $g = 2$ we have four points,
say, $A$, $B$, $C$, and $D$ in their order, where $A$ and $D$ are the
end-points. The $P(A)$ of (7) will then consist of $p_A(A) +
p_B(A) + p_C(A) + p_D(A)$, where, for instance, $p_B(A)$ will
be given by

$$p_B(A) = \int_{D/6}^{\infty} f(x)\,dx, \tag{31}$$

since the distance between $A$ and $B$ is $D/3$. Actual
calculation for $D = 10$ gives the straight channel capacity,
$C' = 1.54$. The corresponding ring channel capacity
according to (24) with the same $D$ is $C = 1.46$. The
relative error $(C' - C)/C$ is thus less than 6 per cent.

This example illustrates incidentally an important
fact about the multilevel channel in general. Since the
error in the binary channel is very small ($\approx 10^{-7}$) for
$D = 10$, the capacity of the binary channel is practically
equal to the noiseless capacity. However, dividing the
interval only into 3 ($= 2^g - 1$), the probability of error
already becomes quite considerable; e.g., $p(1) = 5 \times 10^{-2}$,
resulting in a channel capacity appreciably below the
noiseless capacity, which is 2 in this case. This is evidently
a consequence of the nature of the Gaussian distribution,
which decreases very fast with increasing deviation
beyond the standard deviation.

We can compute the relative error $\epsilon = (C' - C)/C$
committed by the “ring” approximation for any given
values of $g$ and $D$ by extending the method used in the
above example. However, since it is rather complicated
to derive a general formula, we propose to substitute for
the Gaussian fluctuation another simpler type of
fluctuation, for which we can easily derive an exact expression
for any $g$ and $D$. The capacity calculated with this simpler
type of fluctuation coincides with the Gaussian case
when $g$ is very large. Furthermore, since we can obtain
the exact formula for $\epsilon$, we can learn the general nature
of the dependence of $\epsilon$ on $g$ and $D$, which must be more or
less the same in the case of Gaussian fluctuation.

When there are a great number of levels, we can expect
that the result based on the Gaussian distribution
with standard deviation unity can be approximated by
the use of a square distribution whose half width is of the
order of unity. Thus, we propose to try a probability
distribution of the following type:

$$f(x) = \begin{cases} 1/4 & \text{for } -2 \le x \le 2, \\ 0 & \text{for } |x| > 2. \end{cases} \tag{32}$$

Since $f(x)$ of (32) covers a domain of length 4, $(4/d) =
(4/G)2^g$ will be the number of subintervals over which the
error will extend. Therefore, if we define a parameter $b$ by

$$b + (1/2) = 2/d = 2^{g+1}/G, \tag{33}$$

we can roughly interpret $b$ as the number of neighboring
levels on one side to which the error will extend.⁵ We can
now consider any pair out of the three variables $g$, $G$, and $b$
as independent variables. For a given value of $g$, if we
increase $G$, there will be a certain value of $G$ beyond which
$b$ defined by (33) becomes negative. Therefore, we have to
supplement the definition (33) by

$$b = 0 \quad \text{for} \quad G \ge 2^{g+2}. \tag{34}$$

The transition probability will be

$$p = 1/(2b + 1) \tag{35}$$

for the $(2b + 1)$ neighboring levels and will vanish beyond
this domain.

The channel capacity of the ring model according to this
square distribution is easily calculated with the help of
(8), (33), (34), and (35):⁶

$$C = \begin{cases} \log G - 2 & \text{for } G < 2^{g+2}, \\ g & \text{for } G \ge 2^{g+2}. \end{cases} \tag{36}$$

For a given value of $G$, if we take a large enough $g$ such
that $g > \log (G/4)$, we get $C = \log G - 2$, which agrees
very nicely with the $C$ for the Gaussian noise: $C = \log
G - (1/2) \log (2\pi e) = \log G - 2.047$, which is supposed
to be correct for $g \sqsupset \log G$. [See (20)]. (At $G = 4$, $C$ becomes
zero, since the errors then cover the entire circle.)
Eq. (36) corresponds precisely to what was called “zeroth
approximation” at the end of the third section. We can
rewrite all these formulas in terms of $D$ instead of $G$ by
the use of (23).
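The square-distribution capacity (36) is simple enough to check against a direct count. The sketch below takes (36) as $C = \log G - 2$ below the threshold $G = 2^{g+2}$ and $C = g$ above it, and compares it with $g - \log (2b + 1)$, which is what (8) gives when each of the $2b + 1$ neighboring levels is received with probability $1/(2b + 1)$. The function names and the restriction to $G \ge 4$ are assumptions of the sketch.

```python
import math


def square_ring_capacity(G: float, g: int) -> float:
    """Ring-model capacity for the square error distribution, read as
    C = log2 G - 2 for G < 2**(g+2) and C = g otherwise (Eq. (36)).
    Restricting to G >= 4 is an assumption of this sketch."""
    if G < 4.0:
        raise ValueError("sketch assumes G >= 4 (errors do not wrap the ring)")
    if G >= 2 ** (g + 2):
        return float(g)
    return math.log2(G) - 2.0


def gaussian_c_infinity(G: float) -> float:
    """Gaussian large-g capacity, Eq. (16), for comparison."""
    return math.log2(G) - 0.5 * math.log2(2.0 * math.pi * math.e)


if __name__ == "__main__":
    g, b = 4, 2
    # Eq. (33): b + 1/2 = 2**(g+1) / G, solved for G with an integer b.
    G = 2 ** (g + 1) / (b + 0.5)
    direct = g - math.log2(2 * b + 1)  # Eq. (8) with 2b+1 equiprobable outcomes
    print(f"G = {G:.3f}")
    print(f"Eq. (36): {square_ring_capacity(G, g):.6f}")
    print(f"direct  : {direct:.6f}")
    print(f"Gaussian: {gaussian_c_infinity(G):.6f}")
```

The two square-distribution figures agree, and the Gaussian value sits a constant amount below them, which is the 2 versus 2.047 comparison made in the text.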

Now we cut the ring at one point, and assume that the 
“overflowing” errors in the ring model are now registered 
by the receiver as the signal at the end points. The points 
in the central part, whose number is $n - 2b$, are not
affected by this modification, since the errors which reach 
these points in the ring model do not originate from 
points beyond the end-points. 

The resulting channel capacity $C'$, by the use of (6), is
given by

$$C' - C = \frac{2(b+1)}{n} \log (2b+1) - \frac{(b+1)(b+2)}{n(2b+1)} \log \frac{(b+1)(b+2)}{2} + \frac{2}{n(2b+1)} \Bigl(\sum_{r=1}^{b+1} - \sum_{r=b+2}^{2b+1}\Bigr) r \log r, \tag{37}$$

where $C$ is the capacity of the ring model given in (36).
For small values of $b$, we can easily calculate the relative
error $\epsilon = (C' - C)/C$ by this formula. Thus, for $b = 1$,
we get

$$\epsilon = \frac{1.333}{2^g\,(g - 1.585)}, \qquad g = \log \bigl(\tfrac{3}{4} D + 1\bigr), \tag{38}$$

and for $b = 2$,

$$\epsilon = \frac{2.586}{2^g\,(g - 2.322)}, \qquad g = \log \bigl(\tfrac{5}{4} D + 1\bigr). \tag{39}$$

For $b > 2$, it can be shown that

$$\Bigl(\sum_{r=1}^{b+1} - \sum_{r=b+2}^{2b+1}\Bigr) r \log r \approx -(b^2/2) \log (16 b^2/e) \tag{40}$$

in (37), and we obtain

$$\epsilon \approx \frac{\log (2e)}{D \log (D/4)} = \frac{2.443}{D \log (D/4)}. \tag{41}$$

Fig. 4 represents $\epsilon$ as given by (41) as a function of $D$.

⁵ To make it possible for both $b$ and $g$ to be integers, one would
have to define $b$ by $b < (2/d) < b + 1$. But, this will not change the
general features.

⁶ Assuming $b$ to be an integer.
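The effect of cutting the ring can be checked by brute force for the square distribution. The sketch below builds the straight-channel transition probabilities for $n$ levels and integer $b$, piling every overflowing error onto the nearest end-point, evaluates $C'$ from (6), and compares $C' - C$ with a closed form. The closed form coded in `excess_closed_form` is this sketch's reading of (37), rederived from (6), and should be treated as an assumption; all function names are illustrative, and $n$ is taken large enough that the two ends do not interact.

```python
import math


def straight_capacity_square(n: int, b: int) -> float:
    """C' from Eq. (6) for n levels on a line, uniform error over the
    2b+1 nearest levels, with overflow registered at the end-points."""
    width = 2 * b + 1
    received = [0.0] * n  # P(j) of Eq. (7)
    plogp = 0.0           # sum_i sum_j p_i(j) log2 p_i(j)
    for i in range(n):
        row = [0.0] * n
        for target in range(i - b, i + b + 1):
            j = min(max(target, 0), n - 1)  # clamp overflow to an end-point
            row[j] += 1.0 / width
        for j, p in enumerate(row):
            received[j] += p
            if p > 0.0:
                plogp += p * math.log2(p)
    p_term = sum(q * math.log2(q) for q in received if q > 0.0)
    return math.log2(n) + plogp / n - p_term / n


def ring_capacity_square(n: int, b: int) -> float:
    """Ring-model capacity: log2 n - log2(2b+1)."""
    return math.log2(n) - math.log2(2 * b + 1)


def excess_closed_form(n: int, b: int) -> float:
    """C' - C in closed form (assumed reading of Eq. (37))."""
    w = 2 * b + 1

    def rlogr(lo: int, hi: int) -> float:
        return sum(r * math.log2(r) for r in range(lo, hi + 1))

    return (
        2 * (b + 1) / n * math.log2(w)
        - (b + 1) * (b + 2) / (n * w) * math.log2((b + 1) * (b + 2) / 2)
        + 2 / (n * w) * (rlogr(1, b + 1) - rlogr(b + 2, w))
    )


if __name__ == "__main__":
    n = 16
    for b in (1, 2, 3):
        brute = straight_capacity_square(n, b) - ring_capacity_square(n, b)
        print(f"b = {b}: brute force {brute:.6f}, closed form {excess_closed_form(n, b):.6f}")
```

Dividing either column by the ring capacity gives the relative error $\epsilon$ discussed next; the straight channel comes out slightly ahead of the ring because errors that pile up on an end-point are less ambiguous than errors that wrap around.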


The main purpose of the use of a square distribution is
to determine the behavior of $\epsilon$ as a function of $g$ and $D$. We
can notice from (38), (39), and (41) that $\epsilon$ does not
depend appreciably on $g$, as has been foreseen in the
preceding section. In particular, when $g$ is larger than
$\log (\tfrac{5}{4} D + 1)$, $\epsilon$ is practically constant for a given $D$. We can
recognize also in all three formulas (38), (39), and (41),
that $\epsilon$ decreases as $1/(D \log D)$. For large values of $D$,
all three expressions give $\epsilon \approx 2/[D \log (D/4)]$.

The numerical value of the relative error in this
approximation is rather high compared with the Gaussian case.
For instance, for $D = 10$, (38) (which implies $g = 3.1$
there) gives $\epsilon = 11$ per cent. For the same $D$, (39) ($g = 3.8$)
gives $\epsilon = 13$ per cent, and (41) ($g \sqsupset 3.8$) gives $\epsilon = 18$ per
cent. In the Gaussian case for $D = 10$, $g = 2$, we had
$\epsilon = 6$ per cent. However, the numerical values are not of
importance here. The essential fact is that the order of
magnitude of $\epsilon$ remains unchanged for a wide range of
variation of $g$ when $D$ is once determined.

We may infer justifiably that these main features remain
the same in the Gaussian case, namely, the relative error
must be very insensitive to the values of $g$ for a given
value of $D$, and the dependence of $\epsilon$ on $D$ must be
qualitatively given by $1/(D \log D)$. Thus, we can expect that
for values of $D$ larger than 10, the relative error will
rather quickly decrease with increasing $D$, starting from
a value of the order of 6 per cent for $D = 10$, irrespective
of the value of $g$. We can therefore conclude that the
ring model is a good approximation for the problem
of multilevel channels. The order of magnitude of the
error committed by this model can be estimated by the
method explained previously.

Fig. 4: The order of magnitude ($\log_{10} \epsilon$) of the fractional error $\epsilon$
committed by using the “ring” model for the “straight”
multilevel channel, plotted against the snr $D$.

To cope with the asymmetry introduced by cutting
the ring to obtain the straight multilevel channel, we
can think of two methods which may lead to a certain
improvement of the capacity. We may vary the weight
$w_i$ in (5), which has been assumed to be uniform, and/or
vary the subintervals, which have been assumed to be all
equidistant. However, since the effect of the asymmetry
has been shown to be very small, the change in capacity
brought about by these methods will also be very small.


CONCLUSION 


As one of the practical conclusions that can be drawn 
from this paper, we may repeat what has been stated at 
the end of the third section. Namely, when the snr
is given, i.e., when the probability of error of the original
binary channel is given, there is a certain number of 
levels beyond which the channel capacity does not ap- 
preciably increase any longer by increasing the number 
of levels. A quantitative evaluation of such a number of 
levels has been given in the text. 

From a theoretical point of view it is interesting to 
note that (16), (26), and (36) substantiate the expectation, 
mentioned in connection with (10), that the capacity of a 
multilevel channel with a very large level number will 
approach that of a continuous channel. 


ACKNOWLEDGMENT 


The author wishes to express his thanks to Nathaniel 
Rochester for drawing his attention to the problem of 
multilevel information channels. Appreciation is also 
expressed to Mr. Rochester and other members of the 
Information Research Division of the IBM Research 
Center for their stimulating discussions. 




A Generalization of a Method for the Solution of the
Integral Equation Arising in Optimization of Time-Varying
Linear Systems with Nonstationary Inputs*


MARVIN SHINBROT†


Summary—A new method is presented for the solution of the 
integral equation which arises in the optimization of a system in the 
presence of noise when the inputs are not stationary. The method 
depends on the correlation functions satisfying a certain condition 
which, fortunately, is frequently satisfied in practical situations. A 
simple example is presented to illustrate the method. 


INTRODUCTION 


A MAJOR limitation of Wiener’s [1] important
theory of system optimization is the restriction to
stationary inputs. Since in many practical
problems (those of optimization of missile guidance
systems, for example) the inputs are decidedly not
stationary, it would be beneficial if a general theory were
developed which did not make this assumption.

The first step in this direction was taken by Booton [2] 
who derived an integral equation which a system with 
nonstationary inputs must satisfy if it is to be optimum. 

Naturally, this integral equation is not of much use 
unless it can be solved. The extreme generality of form 
displayed by the equation Booton derived, however, 
precludes any real hope of solving it as is. Thus, the real 
problem which remains is one of delimiting conditions 
which, while mild enough to be satisfied in many practical 
cases, remain sufficiently restrictive for the integral 
equation to be solved. 

Two such conditions were presented earlier [3, 4]. 
However, one of these two conditions had no clear physical 
interpretation. Consequently, it seemed desirable to see 
if it could be eliminated. This elimination was reported 
earlier [5] under the hypothesis that the noise is white. 
In the present paper, it will be shown how the equation 
can still be solved without making the offending assump- 
tion and without hypothesizing white noise. 


THE INTEGRAL EQUATION FOR THE OPTIMUM 


In [4] the integral equation which a time-varying
impulse response of a system with nonstationary inputs
must satisfy, if it is to be optimum, was shown to be of
the form

$$\varphi_{\mu i}(t, \tau) = \int_0^t g(t, \sigma)\,\varphi_{ii}(\sigma, \tau)\,d\sigma \quad \text{for } 0 \le \tau < t \tag{1}$$

where $\varphi_{\mu i}$ and $\varphi_{ii}$ represent correlation functions, the
former of the desired output $\mu$ with the input $i$ to the
system, and the latter of the input with itself. It is the


* Manuscript received by the PGIT, March 4, 1957.
purpose of the following to solve (1) with as few conditions
on these correlation functions as possible.


ASSUMPTIONS 


As mentioned in the Introduction, it is necessary to
make some assumptions in order to be able to solve (1).
The following was assumed in [3, 4]:

i) The correlation functions $\varphi_{ii}$ and $\varphi_{\mu i}$ have the form

$$\varphi_{ii}(t, \tau) = \sum_{p=1}^{P} a_p(t)\,b_p(\tau) \tag{2}$$
$$\varphi_{\mu i}(t, \tau) = \sum_{p=1}^{P} c_p(t)\,b_p(\tau) \tag{3}$$

for $t > \tau$.

ii) If $a_p$ and $b_p$ have the same meaning as in assumption
i), then the function

$$w(t, \tau) = \sum_{p=1}^{P} [a_p(t)\,b_p(\tau) - a_p(\tau)\,b_p(t)]$$

is a function of $t - \tau$ alone:

$$w(t, \tau) = w(t - \tau).$$

Now i) is very natural and will be true in many
practical problems. Even if it were not, it would generally
be possible to construct an approximation for which it is
true. This is discussed by Shinbrot [4, 5]. On the other
hand, this is not true of assumption ii). It is not at all
clear exactly what it means physically to say that assumption
ii) holds in any given problem. This report will present
a method for solving (1) without using assumption ii).

Incidentally, it might be mentioned that there is no
loss in generality in having the same functions $b_p(\tau)$
occurring in both (2) and (3). Suppose we had a problem
with

$$\varphi_{ii} = t\tau$$
$$\varphi_{\mu i} = t\tau^2$$

for $t > \tau$. Now, $\varphi_{ii}$ and $\varphi_{\mu i}$ considered as functions of $\tau$
are certainly very different. However, we can write

$$a_1(t) = t, \qquad a_2(t) = 0$$
$$b_1(\tau) = \tau, \qquad b_2(\tau) = \tau^2$$
$$c_1(t) = 0, \qquad c_2(t) = t$$

and with these functions $a$, $b$, $c$, (2) and (3) hold with
$P = 2$.
One further comment is necessary before we turn to the
solution of (1). The autocorrelation function $\varphi_{ii}$ is
always symmetric:

$$\varphi_{ii}(t, \tau) = \varphi_{ii}(\tau, t).$$

Hence, (2), which is true for $t > \tau$, implies that

$$\varphi_{ii}(t, \tau) = \sum_{p=1}^{P} a_p(\tau)\,b_p(t) \quad \text{for } t < \tau. \tag{4}$$

In what follows, vector notation will be used. Set

$$A(t) = (a_1(t), \ldots, a_P(t))$$
$$B(t) = (b_1(t), \ldots, b_P(t))$$
$$C(t) = (c_1(t), \ldots, c_P(t)).$$

Then, assumption i) can be written in the alternative
form (i bis). The correlation functions $\varphi_{ii}(t, \tau)$ and
$\varphi_{\mu i}(t, \tau)$ have the form

$$\varphi_{ii}(t, \tau) = A(t) \cdot B(\tau), \qquad \varphi_{\mu i}(t, \tau) = C(t) \cdot B(\tau) \quad \text{for } \tau < t \tag{5}$$

where the dot denotes the ordinary scalar product of the
vectors. Eq. (4) shows that (5) implies

$$\varphi_{ii}(t, \tau) = A(\tau) \cdot B(t) \quad \text{for } t < \tau. \tag{6}$$

We state now that assumption (i bis) will be made
throughout this report.
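The separable form of assumption (i bis), together with the symmetry of the autocorrelation, can be checked numerically. The sketch below is only an illustration: it borrows the kernel of the worked example given later in the paper (a mean-square-velocity term plus exponentially correlated noise), and the parameter values `A2`, `LAM`, `BETA` and all function names are hypothetical choices, not taken from the source.

```python
import math

# Hypothetical parameter values, chosen only for illustration.
A2, LAM, BETA = 2.0, 0.5, 1.5   # mean-square velocity, noise level, noise bandwidth


def scalar_product(x, y):
    """Ordinary scalar product of two equal-length vectors."""
    return sum(xi * yi for xi, yi in zip(x, y))


def A(t):
    return (A2 * t, LAM * math.exp(-BETA * t))


def B(t):
    return (t, math.exp(BETA * t))


def phi_ii_separable(t, tau):
    """A(t).B(tau) for tau < t and, by symmetry, A(tau).B(t) for t < tau."""
    later, earlier = (t, tau) if t >= tau else (tau, t)
    return scalar_product(A(later), B(earlier))


def phi_ii_direct(t, tau):
    """The same autocorrelation written without the vector notation."""
    return A2 * t * tau + LAM * math.exp(-BETA * abs(t - tau))


for t, tau in [(0.3, 0.1), (0.1, 0.3), (2.0, 2.0), (1.7, 0.4)]:
    print(t, tau, phi_ii_separable(t, tau), phi_ii_direct(t, tau))
```

The two evaluations agree for both orderings of the arguments, which is the content of (5) and (6) for this particular kernel.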


SOLUTION OF THE INTEGRAL EQUATION 


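In the notation of assumption (i bis), (1) becomes

$$C(t) \cdot B(\tau) = A(\tau) \cdot \int_0^{\tau} g(t, \sigma)\,B(\sigma)\,d\sigma + B(\tau) \cdot \int_{\tau}^{t} g(t, \sigma)\,A(\sigma)\,d\sigma \quad \text{for } 0 \le \tau < t. \tag{7}$$

Write the second integral of (7) as $\int_0^t - \int_0^{\tau}$; then, after
rearrangement, this equation becomes

$$\Big[C(t) - \int_0^t g(t, \sigma)\,A(\sigma)\,d\sigma\Big] \cdot B(\tau) = \int_0^{\tau} g(t, \sigma)\,[A(\tau) \cdot B(\sigma) - A(\sigma) \cdot B(\tau)]\,d\sigma \quad \text{for } 0 < \tau < t. \tag{8}$$

We now attempt to write $g(t, \tau)$ in the form

$$g(t, \tau) = k(t, \tau)\,u(t - \tau) + h(t)\,\delta(t - \tau) \tag{9}$$

where $u(t)$ is the unit step function: $u(t) = 0$, $t < 0$,
$u(t) = 1$, $t > 0$, and where $\delta$ denotes the Dirac $\delta$ function
[7]. Note that the function $u(t - \tau)$ enters because we
wish $g(t, \tau)$ to be physically realizable.

The fundamental property of the $\delta$ function that

$$\int_a^b f(\sigma)\,\delta(\sigma - \tau)\,d\sigma = \begin{cases} f(\tau), & a \le \tau \le b \\ 0, & \text{otherwise} \end{cases}$$

can then be used to show that (8) becomes

$$\Big[D(t) - \int_0^t k(t, \sigma)\,A(\sigma)\,d\sigma\Big] \cdot B(\tau) = \int_0^{\tau} k(t, \sigma)\,[A(\tau) \cdot B(\sigma) - A(\sigma) \cdot B(\tau)]\,d\sigma \quad \text{for } 0 < \tau < t \tag{10}$$

where

$$D(t) = C(t) - h(t)\,A(t). \tag{11}$$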

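Now, set

$$k(t, \tau) = G(t) \cdot \Gamma(\tau) \tag{12}$$

where $G$ and $\Gamma$ are vectors. Using (12) in (10), we see that

$$\Big[D(t) - \int_0^t k(t, \sigma)\,A(\sigma)\,d\sigma\Big] \cdot B(\tau) = G(t) \cdot \int_0^{\tau} \Gamma(\sigma)\,[A(\tau) \cdot B(\sigma) - A(\sigma) \cdot B(\tau)]\,d\sigma \quad \text{for } 0 < \tau < t.$$

This equation is certainly true if

$$D(t) - \int_0^t k(t, \sigma)\,A(\sigma)\,d\sigma = G(t) \tag{13a}$$

$$B(\tau) = \int_0^{\tau} \Gamma(\sigma)\,[A(\tau) \cdot B(\sigma) - A(\sigma) \cdot B(\tau)]\,d\sigma \quad \text{for } 0 < \tau < t. \tag{13b}$$

The solution of (1) has now been made to depend on
the solution of the system (13). The process of solving
(13) can be broken into three parts: the determination of
$\Gamma$, of $G$, and, finally, of the function $h$ of (11).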


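DETERMINATION OF Γ

To find $\Gamma$, define the vectors

$$E(\tau) = (a_1(\tau), \ldots, a_P(\tau), b_1(\tau), \ldots, b_P(\tau))$$
$$F(\tau) = (b_1(\tau), \ldots, b_P(\tau), -a_1(\tau), \ldots, -a_P(\tau)).$$

In terms of these vectors, we have

$$A(\tau) \cdot B(\sigma) - A(\sigma) \cdot B(\tau) = E(\tau) \cdot F(\sigma). \tag{14}$$

Hence, if $\Gamma$ has the components $\gamma_1, \ldots, \gamma_P$, (13b) can be
written

$$b_p(\tau) = E(\tau) \cdot \int_0^{\tau} F(\sigma)\,\gamma_p(\sigma)\,d\sigma \quad \text{for } 0 < \tau < t, \quad p = 1, \ldots, P$$

or

$$b_p(\tau) = \sum_{q=1}^{2P} e_q(\tau) \int_0^{\tau} f_q(\sigma)\,\gamma_p(\sigma)\,d\sigma \quad \text{for } 0 < \tau < t, \quad p = 1, \ldots, P \tag{15}$$

where

$$e_q(\tau) = a_q(\tau), \qquad f_q(\tau) = b_q(\tau) \qquad \text{for } q = 1, \ldots, P$$

and

$$e_q(\tau) = b_{q-P}(\tau), \qquad f_q(\tau) = -a_{q-P}(\tau)$$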
for $q = P + 1, \ldots, 2P$.

Now, it is entirely possible that the functions $e_q(\tau)$ are
not all linearly independent. In this case, certain of the
terms in the sum on the right side of (15) can be collected
together until an equation of the form

$$b_p(\tau) = \sum_{q=1}^{Q} e_q(\tau) \int_0^{\tau} \epsilon_q(\sigma)\,\gamma_p(\sigma)\,d\sigma \quad \text{for } 0 < \tau < t, \quad p = 1, \ldots, P \tag{16}$$

is obtained where the functions $e_q$ are linearly independent.
This process will be illustrated in the last section. Eq.
(16) can be reduced immediately to a system of differential
equations. In fact, differentiating (16) $r$ times gives

$$\sum_{q=1}^{Q} e_q^{(r)}(\tau) \int_0^{\tau} \epsilon_q(\sigma)\,\gamma_p(\sigma)\,d\sigma = b_p^{(r)}(\tau) - \sum_{q=1}^{Q} \sum_{s=1}^{r} \binom{r}{s}\,e_q^{(r-s)}(\tau)\,[\epsilon_q(\tau)\,\gamma_p(\tau)]^{(s-1)} \quad \text{for } 0 < \tau < t \tag{17}$$

where $\binom{r}{s}$ denotes the binomial coefficient:

$$\binom{r}{s} = \frac{r!}{s!\,(r - s)!}.$$

Eq. (17) with $r = 0, 1, \ldots, Q - 1$ represents $Q$ simultaneous
equations in the $Q$ unknowns $\int_0^{\tau} \epsilon_q(\sigma)\,\gamma_p(\sigma)\,d\sigma$,
$q = 1, \ldots, Q$ ($p$ fixed). Furthermore, the coefficient
determinant

$$\det\big(e_q^{(r)}\big) \tag{18}$$

is never zero, since we know the functions $e_q$ to be independent,
from which it follows that their Wronskian
(18) is different from zero. Consequently, (17) can be
solved for the integrals

$$\int_0^{\tau} \epsilon_q(\sigma)\,\gamma_p(\sigma)\,d\sigma$$

in terms of the functions $\gamma_p(\tau)$ and their derivatives.
Differentiating these solution equations once more results
in a linear differential equation for $\gamma_p(\tau)$.

This of course still does not determine the $\gamma_p$ explicitly,
for it can easily happen that the differential equations may
not be explicitly soluble. If they are (and this frequently
happens in practice), we can now go on with the solution,
but even if they are not, much is known about finding
approximate solutions of differential equations, far more
than is known about approximations to solutions of
integral equations.

The functions $\gamma_p$ have now been determined with the
exception of certain constants which occur when the
differential equation for $\gamma_p$ is solved. These constants can
be determined by substitution of the expression for $\gamma_p$ into
the original integral equation (13b).


DETERMINATION OF G 


Thus, $\gamma_p$ can be found completely. To find the components
$g_p(t)$ of the vector $G(t)$ occurring in (12), consider
(13a). Use of (12) in (13a) gives

$$d_p(t) = g_p(t) + \sum_q g_q(t) \int_0^t \gamma_q(\sigma)\,a_p(\sigma)\,d\sigma \tag{19}$$

for each component $p$. Since the functions $\gamma_p$ are now known, (19) can be
considered as Q simultaneous equations to be solved for the
Q unknowns $g_p(t)$ as functions of $h(t)$ since, according to
(11), $d_p(t)$ depends on $h(t)$.


DETERMINATION OF h 


It remains to show how $h$ may be found. The functions
$g_p(t)$ have been determined in terms of the functions
$d_p(t)$; however, these functions are not completely known
since the function $h(t)$ occurring in (11) has not as yet
been specified. On the other hand, it can be seen from what
has gone before that whatever this function $h$ may be,
the function (9) with $k(t, \tau)$ being given by (12) and the
vectors $G$ and $\Gamma$ being determined by the method just
described satisfies the fundamental integral equation (1).
Consequently, we may say that we have found an infinite
number of solutions of (1). How, then, do we specify the
function $h$?

To answer this, consider the functions $\gamma_p(\tau)$. It can be
seen that they have been determined independently of $h$.
Also, since the functions $b_p(\tau)$ of (16) are zero for $\tau < 0$, it
can be seen that the differentiated equation (17) will generally
include some derivatives of $\delta$ functions on the right-hand
side. In fact, assuming that the functions $b_p$ themselves
do not involve $\delta$ functions, it can be seen that the
differential equation for $\gamma_p$ will in general be of order $Q - 2$.
On the other hand, the $Q$th derivative of $b_p$ will occur in it,
which means, since $b_p(\tau)$ may be discontinuous at $\tau = 0$,
that $\delta^{(Q-1)}(\tau)$ will probably occur in the differential
equation. Hence, the solution for $\gamma_p(\tau)$ generally will
depend linearly on the first derivative of $\delta(\tau)$. This means
that the optimum system may involve a differentiator.
Since it is difficult, if not impossible, to differentiate
a noisy input, it is highly desirable that this term $\delta'(\tau)$ be
eliminated. The function $h(t)$ can be used for this purpose.

To see this, consider (11) and (19). From these equations,
it follows that the $g_p(t)$ depend linearly on $h(t)$.
Hence, from (12), the $\delta'(\tau)$ terms which occur in $\Gamma(\tau)$ will
be multiplied by a linear function of $h(t)$; i.e., $k(t, \tau)$ will
contain a term of the form $[\alpha(t) + \beta(t)\,h(t)]\,\delta'(\tau)$ and,
furthermore, $\delta'(\tau)$ will appear nowhere else in the expression
for $k$. Thus if $h(t)$ is set equal to $-\alpha(t)/\beta(t)$, the
optimum will not involve any differentiators. This process
will be illustrated in the next section.


EXAMPLE 


To illustrate the method of the preceding section, we
now present a detailed example.

Consider the problem of determining the position of a
moving particle when the measurements are corrupted
by noise. In order to simplify the problem to the point
where the method will not be obscured by the details of
the calculation, we make the following assumptions:¹

a) The particle moves along a given straight line with
constant (but unknown) speed.

b) The particle can come from only one direction along
this line.

c) The time when the particle first appears (at the
limit of our measuring devices, say) is known.

d) The noise and the particle position (message) are
independent.

e) The noise is stationary and is described by an autocorrelation
function of the form

$$\varphi_{nn}(t, \tau) = \lambda e^{-\beta |t - \tau|} \tag{20}$$


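Define a coordinate system with one axis along the line
on which the particle moves and with origin at the point
of its appearance. Then, under the assumptions made, a
typical message will be

$$m(t; a) = at$$

where $a$ is the speed of the particle. Hence, we can compute

$$\varphi_{mm}(t, \tau) = \operatorname{Av}(at \cdot a\tau) = \overline{a^2}\,t\tau$$

where $\overline{a^2}$ is the expected mean-square velocity of the
particle. We assume an approximation to $\overline{a^2}$ is known.

By d) and e), we have

$$\varphi_{mi}(t, \tau) = \varphi_{mm}(t, \tau) = \overline{a^2}\,t\tau$$

while from e),

$$\varphi_{ii}(t, \tau) = \varphi_{mm}(t, \tau) + \varphi_{nn}(t, \tau) = \overline{a^2}\,t\tau + \lambda e^{-\beta |t - \tau|}.$$

Since the problem is one of filtering, the desired output $\mu$
is the message $m$. Thus, in notation of (i bis), we can set

$$A(t) = (\overline{a^2}\,t,\ \lambda e^{-\beta t}) \tag{21a}$$
$$B(t) = (t,\ e^{\beta t}) \tag{21b}$$
$$C(t) = (\overline{a^2}\,t,\ 0) \tag{21c}$$

for $t > 0$.² Thus, from (14),

$$E(\tau) = (\overline{a^2}\,\tau,\ \lambda e^{-\beta \tau},\ \tau,\ e^{\beta \tau})$$
$$F(\tau) = (\tau,\ e^{\beta \tau},\ -\overline{a^2}\,\tau,\ -\lambda e^{-\beta \tau}).$$

Therefore, (15) becomes

$$b_p(\tau) = \overline{a^2}\,\tau \int_0^{\tau} \sigma\,\gamma_p(\sigma)\,d\sigma + \lambda e^{-\beta \tau} \int_0^{\tau} e^{\beta \sigma}\,\gamma_p(\sigma)\,d\sigma - \overline{a^2}\,\tau \int_0^{\tau} \sigma\,\gamma_p(\sigma)\,d\sigma - \lambda e^{\beta \tau} \int_0^{\tau} e^{-\beta \sigma}\,\gamma_p(\sigma)\,d\sigma \quad \text{for } 0 < \tau < t. \tag{22}$$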


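¹ All of these assumptions can be eliminated.

² Note, incidentally, that for this simple example, assumption
ii) is fulfilled and so the simpler method of Shinbrot [3, 4] can also
be used to solve it.

Now, clearly, the components of $E(\tau)$ are not linearly
independent. This fact manifests itself in (22) by the
cancellation of the first and third integrals on the right-hand
side. Thus, (16) is

$$b_p(\tau) = \lambda e^{-\beta \tau} \int_0^{\tau} e^{\beta \sigma}\,\gamma_p(\sigma)\,d\sigma - \lambda e^{\beta \tau} \int_0^{\tau} e^{-\beta \sigma}\,\gamma_p(\sigma)\,d\sigma \quad \text{for } 0 < \tau < t. \tag{23}$$

Differentiate this equation. We obtain

$$-\frac{1}{\beta}\,b_p'(\tau) = \lambda e^{-\beta \tau} \int_0^{\tau} e^{\beta \sigma}\,\gamma_p(\sigma)\,d\sigma + \lambda e^{\beta \tau} \int_0^{\tau} e^{-\beta \sigma}\,\gamma_p(\sigma)\,d\sigma \quad \text{for } 0 < \tau < t. \tag{24}$$

Adding the two preceding equations results in

$$\int_0^{\tau} e^{\beta \sigma}\,\gamma_p(\sigma)\,d\sigma = \frac{e^{\beta \tau}}{2\beta\lambda}\,[\beta\,b_p(\tau) - b_p'(\tau)]$$

so that

$$\gamma_p(\tau) = \frac{\beta^2\,b_p(\tau) - b_p''(\tau)}{2\beta\lambda}. \tag{25}$$

It is important to note that the same result would have
been obtained had we, instead of adding (23) and (24),
subtracted them to get an expression for $\int_0^{\tau} e^{-\beta \sigma}\,\gamma_p(\sigma)\,d\sigma$.

We now write, from (21) and the fact that $B(\tau) = 0$
for $\tau < 0$,

$$b_1(\tau) = \tau\,u(\tau)$$
$$b_2(\tau) = e^{\beta \tau}\,u(\tau)$$

where $u(\tau)$ is the unit step. Hence, from (25),

$$\gamma_1(\tau) = \frac{\beta^2 \tau - \delta(\tau)}{2\beta\lambda}, \qquad \gamma_2(\tau) = -\frac{\beta\,\delta(\tau) + \delta'(\tau)}{2\beta\lambda}. \tag{26}$$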

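We now turn to the computation of the functions $g_p(t)$.
From (21a) and (26), we find

$$\int_0^t a_1(\tau)\,\gamma_1(\tau)\,d\tau = \frac{\beta\,\overline{a^2}\,t^3}{6\lambda}$$
$$\int_0^t a_2(\tau)\,\gamma_1(\tau)\,d\tau = -\frac{1 + \beta t}{2\beta}\,e^{-\beta t}$$
$$\int_0^t a_1(\tau)\,\gamma_2(\tau)\,d\tau = \frac{\overline{a^2}}{2\beta\lambda}$$
$$\int_0^t a_2(\tau)\,\gamma_2(\tau)\,d\tau = -1.$$

In the third of these equations, we have used the fact
that

$$\int_0^t f(\tau)\,\delta(\tau)\,d\tau = f(0)$$

to prove that

$$\int_0^t f(\tau)\,\delta'(\tau)\,d\tau = -f'(0).$$

Consequently, (19) becomes

$$\Big(1 + \frac{\beta\,\overline{a^2}\,t^3}{6\lambda}\Big)\,g_1(t) + \frac{\overline{a^2}}{2\beta\lambda}\,g_2(t) = d_1(t)$$
$$-\frac{1 + \beta t}{2\beta}\,e^{-\beta t}\,g_1(t) = d_2(t).$$

Hence,

$$g_1(t) = -\frac{2\beta e^{\beta t}}{1 + \beta t}\,d_2(t)$$
$$g_2(t) = \frac{2\beta\lambda}{\overline{a^2}} \Big[ d_1(t) + \frac{\beta e^{\beta t}}{3\lambda}\,\frac{6\lambda + \beta\,\overline{a^2}\,t^3}{1 + \beta t}\,d_2(t) \Big]. \tag{27}$$

Now, from (11) and (21),

$$d_1(t) = \overline{a^2}\,t\,[1 - h(t)]$$
$$d_2(t) = -\lambda\,h(t)\,e^{-\beta t}. \tag{28}$$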


Notice that $\gamma_1(\tau)$ does not contain the derivative of a
$\delta$ function. Thus, the optimum impulse response will not
involve a differentiator if and only if $g_2(t)$, which multiplies
$\gamma_2(\tau)$ in the expression for $g(t, \tau)$, is zero. Setting
$g_2(t)$ from (27) equal to zero and using (28) gives

$$h(t) = \frac{3\,\overline{a^2}\,t\,(1 + \beta t)}{6\beta\lambda + 3\,\overline{a^2}\,t + 3\beta\,\overline{a^2}\,t^2 + \beta^2\,\overline{a^2}\,t^3}. \tag{29}$$

With this function $h(t)$, substitution of (28) in (27) gives

$$g_1(t) = \frac{2\beta\lambda\,h(t)}{1 + \beta t}, \qquad g_2(t) = 0.$$

Hence, from (9) and (12), the optimum impulse response is

$$g(t, \tau) = \frac{h(t)}{1 + \beta t}\,[\beta^2 \tau\,u(t - \tau) + (1 + \beta t)\,\delta(t - \tau) - \delta(\tau)] \tag{30}$$

where $h(t)$ is given by (29).
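For this example, the minimum mean-square error can
be found to be

$$\overline{\epsilon^2} = \overline{a^2}\,t^2 - \overline{a^2}\,t \int_0^t \tau\,g(t, \tau)\,d\tau = \frac{6\beta\lambda\,\overline{a^2}\,t^2}{6\beta\lambda + 3\,\overline{a^2}\,t + 3\beta\,\overline{a^2}\,t^2 + \beta^2\,\overline{a^2}\,t^3} \approx \frac{6\lambda}{\beta t} \quad \text{as } t \to \infty.$$

Note that this says that by waiting long enough, the
approximation to the particle's position can be made as
good as desired.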
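The behavior of the example can be explored numerically. The sketch below assumes the readings $h(t) = 3\overline{a^2}t(1+\beta t)/D(t)$ and mean-square error $6\beta\lambda\overline{a^2}t^2/D(t)$ with $D(t) = 6\beta\lambda + 3\overline{a^2}t + 3\beta\overline{a^2}t^2 + \beta^2\overline{a^2}t^3$, and an impulse response of the form $[h/(1+\beta t)][\beta^2\tau\,u(t-\tau) + (1+\beta t)\delta(t-\tau) - \delta(\tau)]$. These are reconstructions from a damaged scan, and the function names and parameter values are hypothetical.

```python
def denominator(t, a2, lam, beta):
    """D(t) shared by the weight h(t) and the mean-square error."""
    return 6 * beta * lam + 3 * a2 * t + 3 * beta * a2 * t**2 + beta**2 * a2 * t**3


def h_opt(t, a2, lam, beta):
    """Weight of the delta(t - tau) term that removes the differentiator."""
    return 3 * a2 * t * (1 + beta * t) / denominator(t, a2, lam, beta)


def first_moment_of_response(t, a2, lam, beta):
    """Integral of tau * g(t, tau) over 0..t, taken term by term.

    The beta^2 * tau term contributes beta^2 t^3 / 3, the delta(t - tau)
    term contributes (1 + beta t) t, and the delta(tau) term contributes 0.
    """
    scale = h_opt(t, a2, lam, beta) / (1 + beta * t)
    return scale * (beta**2 * t**3 / 3 + (1 + beta * t) * t)


def mse_from_response(t, a2, lam, beta):
    """a2 t^2 - a2 t * integral(tau g), the error written with the response."""
    return a2 * t**2 - a2 * t * first_moment_of_response(t, a2, lam, beta)


def mse_closed_form(t, a2, lam, beta):
    return 6 * beta * lam * a2 * t**2 / denominator(t, a2, lam, beta)


a2, lam, beta = 2.0, 0.5, 1.5   # hypothetical values
for t in (0.5, 2.0, 10.0, 100.0):
    print(t, h_opt(t, a2, lam, beta), mse_closed_form(t, a2, lam, beta), 6 * lam / (beta * t))
```

Under these assumptions the two error expressions agree, $h(t)$ stays between 0 and 1, and the error approaches $6\lambda/(\beta t)$ for large $t$.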

The impulse response (30) can be interpreted in terms
of a transfer function. This problem is considered elsewhere [4].


On the Mean Square Noise Power of an Optimum Linear
Discrete Filter Operating on Polynomial Plus White
Noise Input*


MARVIN BLUM†


Summary: In a recent article¹ Johnson presents an asymptotic
formula for the output noise power of an optimum filter
designed to make a zero-lag estimate of either the input or its derivatives.
It is assumed that the input function consists of a nonrandom
polynomial plus stationary uncorrelated noise.

It is the purpose of this paper to present an exact formula for the
output noise power for the same input model. The formula presented
is more general in that the estimation can be for any lag α with
respect to the latest data point.

Tables and graphs of the root mean square error for the zero-lag
estimation of the 0th, 1st, and 2nd derivative are presented as a
function of the input polynomial up to degree 5 and memory spans
up to 100 sample points. A comparison is made of the relative error
in root mean square using the asymptotic formula derived by
Johnson.


INTRODUCTION


In a paper on the same topic by the author,² a generalization
of the problem of determining the ordinates
of the weighting sequence of the optimum linear
discrete filter is considered. The special case for the
polynomial plus white noise is considered and the formula
for the weighting sequence is obtained in (49) to (53).
An equivalence between the optimum filtering problem
and minimum variance curve fitting techniques is proven.
For the purposes of simplicity of derivation, it will be
simpler to utilize the concepts of least squares curve
fitting since the weighting sequence is invariant under a
time translation and may be determined over a standard
interval.


ANALYSIS


Consider a set of equally spaced data points $(u, y_u)$,
$u = 1, 2, \ldots, M$. The problem is to fit a least squares
polynomial of degree $n$ to these points and to estimate the
$K$th derivative of the observed data from the curve fit
at any point on the $u$ scale.

For the purpose of the analysis it is convenient to
utilize orthogonal polynomials in the curve fitting procedure.
Thus let the true polynomial be given by

$$P(u) = \sum_{L=0}^{n} a_L\,\xi_L(u), \tag{1}$$

where the polynomials $\xi_L(u)$ are orthogonal,³ e.g., satisfy
the following relationships

$$\sum_{u=1}^{M} \xi_L(u)\,\xi_K(u) = 0, \qquad K \ne L, \tag{2}$$
$$\sum_{u=1}^{M} \xi_L^2(u) = S(L, M). \tag{3}$$

It is assumed that the observations $y_u$ are given by

$$y_u = P(u) + N(u).$$

* Manuscript received by the PGIT, March 22, 1957.

† Convair, San Diego, Calif.

¹ Johnson, "Optimum, linear, discrete filtering of signals
containing a nonrandom component," IRE Trans., vol. IT-2,
pp. 49-…; 1956.

² M. Blum, "An extension of the minimum mean square prediction
theory for sampled input signals," IRE Trans., vol. IT-2,
pp. 176-184; September, 1956.


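The $N(u)$ are assumed to be random, stationary, and
uncorrelated errors.

Then the least squares estimates $\hat{a}_L$ of the coefficients
$a_L$ are obtained by minimizing

$$r = \sum_{u=1}^{M} \Big[ \sum_{L=0}^{n} \hat{a}_L\,\xi_L(u) - y_u \Big]^2 \tag{4}$$

with respect to each of the parameters $\hat{a}_L$.
Thus one obtains

$$\sum_{u=1}^{M} \Big[ \sum_{L=0}^{n} \hat{a}_L\,\xi_L(u) - y_u \Big]\,\xi_J(u) = 0, \qquad J = 0, 1, \ldots, n. \tag{5}$$

Solving (5) for $\hat{a}_L$ one obtains

$$\hat{a}_L = \frac{\sum_{u=1}^{M} y_u\,\xi_L(u)}{S(L, M)}. \tag{6}$$

By substituting (6) one obtains the curve fit relationship

$$\hat{Y}(u) = \sum_{L=0}^{n} \hat{a}_L\,\xi_L(u). \tag{7}$$

To evaluate the estimate of the $K$th derivative at
$u = M + \alpha$ one need only take the $K$th derivative of
both sides of (7) (considering $u$ as a continuous variable)
and obtain

$$\frac{d^K \hat{Y}(u)}{du^K}\Big|_{u=M+\alpha} = \hat{Y}^{(K)}(M + \alpha) = \sum_{L=0}^{n} \hat{a}_L\,\frac{d^K \xi_L(u)}{du^K}\Big|_{u=M+\alpha}. \tag{8}$$

Let

$$\frac{d^K \xi_L(u)}{du^K}\Big|_{u=M+\alpha} = \xi_L^{(K)}(M + \alpha), \tag{9}$$

and substituting (6) into (8) one obtains

$$\hat{Y}^{(K)}(M + \alpha) = \sum_{u=1}^{M} y_u \sum_{L=0}^{n} \frac{\xi_L(u)\,\xi_L^{(K)}(M + \alpha)}{S(L, M)}. \tag{10}$$

Let

$$W_{M-u} = \sum_{L=0}^{n} \frac{\xi_L(u)\,\xi_L^{(K)}(M + \alpha)}{S(L, M)}, \qquad u = 1, \ldots, M, \tag{11}$$

then

$$\hat{Y}_{M+j}^{(K)}(M + \alpha + j) = \sum_{u=1}^{M} W_{M-u}\,y_{u+j}, \qquad j = 0, 1, 2, \ldots \tag{12}$$

³ R. A. Fisher and F. Yates, "Statistical Tables for Biological,
Agricultural and Medical Research," Oliver and Boyd, Edinburgh,
Scotland; 1938.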


Eq. (12) is directly interpreted as the input-output
relationship of a digital filter with weighting sequence
$W_0, W_1, \ldots, W_{M-1}$. The input is the sequence $y_{u+j}$ and
the output is $\hat{Y}_{M+j}^{(K)}(M + j + \alpha)$. The output is available
in real time after the last data point is sampled and
estimates the $K$th derivative of the input at $u = M + \alpha + j$.
The filter has a finite memory over the
interval $(M - 1)T$.

Since the estimators $\hat{a}_L$ are unbiased, the error in estimate
is given by ($j = 0$),

$$\epsilon_K = \hat{Y}^{(K)}(M + \alpha) - P^{(K)}(M + \alpha) = \sum_{u=1}^{M} W_{M-u}\,N(u). \tag{13}$$

The mean square error of estimate is given by

$$\sigma_K^2 = \sigma_N^2 \sum_{u=1}^{M} W_{M-u}^2 \qquad (\sigma_N^2 = \text{noise mean square error}). \tag{14}$$

Substituting (11), (2), and (3) into (14) one obtains

$$\frac{\sigma_K^2}{\sigma_N^2} = \delta^2(M, \alpha, K, n) = \sum_{L=0}^{n} \frac{[\xi_L^{(K)}(M + \alpha)]^2}{S(L, M)} \tag{15}$$

which is the main result of this paper.
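The main result can be evaluated directly from its definition. The following is a minimal sketch, not the author's procedure: instead of tabulated polynomials it builds the orthogonal polynomials on $u = 1, \ldots, M$ by Gram-Schmidt in exact rational arithmetic, then forms the weighting sequence and $\delta$ as sums over $L$ of $\xi_L(u)\,\xi_L^{(K)}(M+\alpha)/S(L,M)$ and $[\xi_L^{(K)}(M+\alpha)]^2/S(L,M)$. All function names are hypothetical, and $n < M$ is assumed.

```python
from fractions import Fraction
from math import sqrt


def poly_eval(coeffs, x):
    """Evaluate a polynomial given by ascending-power coefficients."""
    return sum(c * x**k for k, c in enumerate(coeffs))


def poly_deriv(coeffs, K):
    """K-th derivative of a polynomial (ascending-power coefficients)."""
    for _ in range(K):
        coeffs = [k * c for k, c in enumerate(coeffs)][1:]
    return coeffs


def orthogonal_polys(n, M):
    """Monic polynomials of degree 0..n, orthogonal over u = 1..M (n < M)."""
    pts = range(1, M + 1)
    polys = []
    for L in range(n + 1):
        c = [Fraction(0)] * L + [Fraction(1)]          # start from u**L
        for p in polys:                                # remove lower components
            S = sum(poly_eval(p, u) ** 2 for u in pts)
            proj = sum(u**L * poly_eval(p, u) for u in pts) / S
            c = [ck - proj * (p[k] if k < len(p) else 0) for k, ck in enumerate(c)]
        polys.append(c)
    return polys


def weights(M, alpha, K, n):
    """Weights applied to y_1..y_M (in that order) to estimate the K-th
    derivative at u = M + alpha from a degree-n least squares fit."""
    pts = range(1, M + 1)
    w = [Fraction(0)] * M
    for p in orthogonal_polys(n, M):
        S = sum(poly_eval(p, u) ** 2 for u in pts)
        dK = poly_eval(poly_deriv(p, K), M + alpha)
        for i, u in enumerate(pts):
            w[i] += poly_eval(p, u) * dK / S
    return w


def delta(M, alpha, K, n):
    """Ratio of rms output error to rms noise."""
    pts = range(1, M + 1)
    total = Fraction(0)
    for p in orthogonal_polys(n, M):
        S = sum(poly_eval(p, u) ** 2 for u in pts)
        total += poly_eval(poly_deriv(p, K), M + alpha) ** 2 / S
    return sqrt(total)


print(weights(3, 0, 0, 1))   # weights on y_1, y_2, y_3 for a straight-line fit
print(delta(3, 0, 0, 1))
```

For a straight-line fit to three points with zero lag the weights come out as $-1/6$, $1/3$, $5/6$, and the sum of their squares equals $\delta^2$, as the orthogonality relations require.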

Eq. (15) has been derived for unit time between
samples. If the interval between samples is given by $T$
then (11) is modified as follows:

$$W_{M-u,T} = \frac{1}{T^K}\,W_{M-u} \tag{16}$$

and (15) becomes

$$\frac{\sigma_K^2}{\sigma_N^2} = \frac{1}{T^{2K}}\,\delta^2(M, \alpha, K, n). \tag{17}$$

Note that

$$\delta^2(M, \alpha, K, n) = \delta^2(M, \alpha, K, n - 1) + \frac{[\xi_n^{(K)}(M + \alpha)]^2}{S(n, M)},$$

so that increasing the degree of curve fit is never associated
with a decrease in $\sigma_K$ since the second term on the right-hand side is
nonnegative for fixed $M$, $\alpha$, and $K$.

SPECIAL CASE


Special formulas for 6° are as follows. Let K = 0, 
a = 0.1) =o soe, mmren , 


6(M,a,0,n) = 


As an example, when $\alpha = 0$, $K = 0$, one obtains a zero-lag
estimate of the input. The mean square error output is
then proportional to $W_0$, the coefficient which multiplies
the latest data point, e.g.,

$$\sigma_K^2 = \sigma_N^2\,W_0.$$


Other relationships on the $\delta$ which may be useful are
as follows. Let the order of the derivative equal the order
of degree of curve fit, e.g., $K = n$; then

$$\delta^2(M, \alpha, n, n) = \frac{(n!)^2}{S(n, M)} \tag{18}$$

and is independent of $\alpha$.

Let the order of the derivative equal one less than the
degree of the curve fitting polynomial, e.g., $K = n - 1$;
then

$$\delta^2(M, \alpha, n - 1, n) = \delta^2(M, \alpha, n - 1, n - 1) + \Big[\frac{M - 1}{2} + \alpha\Big]^2 \delta^2(M, \alpha, n, n) \tag{19}$$

$$= \frac{[(n - 1)!]^2}{S(n - 1, M)} + \Big[\frac{M - 1}{2} + \alpha\Big]^2 \frac{(n!)^2}{S(n, M)}. \tag{20}$$

When $\alpha = -[(M - 1)/2]$, e.g., the midpoint of the
curve fitting interval,

$$\delta^2\Big(M, \alpha = -\frac{M - 1}{2}, n - 1, n\Big) = \delta^2(M, \alpha, n - 1, n - 1). \tag{21}$$

This represents the minimum $\delta^2$ obtainable with respect
to $\alpha$.

For $\alpha \ll M$, the asymptotic value of $\delta$ is given by

$$\delta(M, \alpha, K, n) \approx \frac{(n + K + 1)!}{(n - K)!\,K!\,\sqrt{2K + 1}}\;M^{-(2K+1)/2}, \qquad K = 0, 1, 2, \ldots, n. \tag{22}$$


Tables I, II, and III present the exact
values of $\delta$ using (17) and the percentage relative error
in $\delta$ using (22). Figs. 1, 2, and 3
present a plot of $\delta$ using (15) for $M = 10$ to 100 for purposes
of interpolation.
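The relative error R reported in the tables can be reproduced for the special case $K = n$, where the exact value reduces to $n!/\sqrt{S(n, M)}$. The sketch below assumes two readings of damaged formulas: the closed form $S(L, M) = (L!)^4 (M+L)! / [(2L)!\,(2L+1)!\,(M-L-1)!]$ for the sum of squares, and the asymptotic value $\delta \approx (n+K+1)!\,/\,[(n-K)!\,K!\,\sqrt{2K+1}]\cdot M^{-(2K+1)/2}$. Function names are hypothetical.

```python
from math import factorial, sqrt


def sum_of_squares(L, M):
    """Assumed closed form for S(L, M); requires L < M."""
    return (factorial(L) ** 4 * factorial(M + L)
            / (factorial(2 * L) * factorial(2 * L + 1) * factorial(M - L - 1)))


def delta_exact_k_equals_n(M, n):
    """Exact delta when the derivative order equals the degree of fit."""
    return factorial(n) / sqrt(sum_of_squares(n, M))


def delta_asymptotic(M, K, n):
    """Assumed asymptotic form of delta for large M."""
    coeff = factorial(n + K + 1) / (factorial(n - K) * factorial(K) * sqrt(2 * K + 1))
    return coeff * M ** (-(2 * K + 1) / 2)


def percent_error(M, n):
    """R = 100 (asymptotic - exact) / exact for K = n."""
    exact = delta_exact_k_equals_n(M, n)
    return 100.0 * (delta_asymptotic(M, n, n) - exact) / exact


for M in (10, 20, 50):
    print(M, delta_exact_k_equals_n(M, 1), percent_error(M, 1))
```

With these readings, $M = 10$ gives $\delta \approx 0.1101$ and $R \approx -0.50$ per cent for $K = n = 1$, and $\delta \approx 0.08704$ and $R \approx -2.51$ per cent for $K = n = 2$, in line with the legible table entries.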

Eq. (17) is identical with the results one would obtain
from Blum² as are the values of the weighting sequence.

The interpretation of the parameter $\alpha$ is as follows:
when $\alpha = 0$, one obtains a zero-lag estimate with respect
to the latest data point, when $\alpha > 0$ one
obtains an extrapolation, and when $-(M - 1) < \alpha < 0$ one


TABLE I

Table of δ(M, α, K, n), α = 0, K = 0, for evaluating the root mean square error for zero-lag (α = 0) estimation of the input (K = 0)
as a function of the degree of the curve fitting polynomial (n) and the number of data points (M).

Table of R(M, α, K, n), fixed α, for evaluating the relative errors using the asymptotic root mean square formula compared with zero-lag
(α = 0) estimation of the input (K = 0) as a function of the degree of the curve fitting polynomials (n) and the number of data points (M).

R = [(asymptotic δ − exact δ) / exact δ] × 100 = per cent error in δ using asymptotic formula.

   M        n = 0       1           2           3           4           5
   2    δ   0.70711
        R   0
   3    δ   0.57735     0.91287
        R   0           26.5
   4    δ   0.50000     0.83666     0.97468
        R   0           19.5        53.9
   5    δ   0.44721     0.77460     0.94112     0.99283
        R   0           15.5        42.6        80.2
   6    δ   0.40825     0.72375     0.90633     0.97996     0.99801
        R   0           12.8        35.1        66.6        104.5
   7    δ   0.37796     0.68139     0.87287     0.96362     0.99348     0.99946
        R   0           10.9        29.9        56.9        90.2        126.9
   8    δ   0.35355     0.64550     0.84162     0.94548     0.98665     0.99796
        R   0           9.54        26.0        49.6        79.2        112.6
   9    δ   0.33333     0.61464     0.81278     0.92660     0.97800     0.99533
        R   0           8.46        23.0        43.9        70.4        100.9
  10    δ   0.31623     0.58775     0.78625     0.90762     0.96802     0.99157
        R   0           7.61        20.7        39.4        63.3        91.3
  20    δ   0.22361     0.43095     0.60892     0.74985     0.85231     0.92022
        R   0           3.77        10.1        19.3        31.1        45.8
  50    δ   0.14142     0.27865     0.40784     0.52578     0.63011     0.71946
        R   0           1.50        4.02        7.59        12.2        17.9
  95    δ   0.10260     0.20359     0.30142     0.39471     0.48223     0.56299
        R   0           0.790       ?           ?           6.38        9.34
1001    δ   0.031607    0.063167    0.094632    0.12596     0.15709     0.18800
        R   0           0.0749      0.1996      0.374       0.5998      0.87517
TABLE II

Table of δ(M, α, K, n), α = 0, K = 1, for evaluating the root mean square error for zero-lag (α = 0) estimation of the first derivative
of the input (K = 1) as a function of the degree of the curve fitting polynomial (n) and the number of data points (M).

Table of R(M, α, K, n), fixed α, for evaluating the relative errors using the asymptotic root mean square formula compared with zero-lag
(α = 0) estimation of the first derivative of the input (K = 1) as a function of the degree of the curve fitting polynomials (n) and the
number of data points (M).

R = [(asymptotic δ − exact δ) / exact δ] × 100 = per cent error in δ using asymptotic formula.

   M        n = 1        2           3           4           5
   2    δ   1.0
        R   22.5
   3    δ   0.70711      2.5495
        R   ?            4.59
   4    δ   0.44721      1.5652      3.83695
        R   ?            10.7        12.8
   5    δ   0.31623      1.1148      2.52566     5.5839
        R   −2.02        ?           22.7        10.9
   6    δ   0.23905      0.85252     1.90348     3.6802      8.2418
        R   −1.40        10.6        23.8        28.1        0.0944
   7    δ   0.18898      0.68138     1.52189     2.8226      5.2190
        R   −1.03        9.80        22.9        32.5        25.4
   8    δ   0.15430      0.56167     1.26041     2.3028      3.9614
        R   −0.784       9.03        21.5        33.0        35.3
   9    δ   0.12910      0.47377     1.06943     1.9445      3.2351
        R   −0.619       8.32        20.0        31.9        38.8
  10    δ   0.11010      0.40685     0.92389     1.6794      2.7476
        R   −0.501       7.70        18.6        30.5        39.5
  20    δ   0.038778     0.14855     0.35079     0.65608     1.0684
        R   −0.125       4.29        10.4        18.1        26.9
  50    δ   0.0097999    0.038494    0.093871    0.18201     0.30714
        R   −0.020       1.81        4.37        7.66        ?
  95    δ   0.0037414    0.014821    0.036558    0.071883    0.12326
        R   −0.00553     0.970       2.34        4.09        6.23
1001    δ   0.000109380  0.00043711  0.0010913   0.0021790   0.0038054
        R   ?            0.0937      0.225       0.394       0.602

TABLE III

Table of δ(M, α, K, n), α = 0, K = 2, for evaluating the root mean square error for zero-lag (α = 0) estimation of the second derivative
of the input (K = 2) as a function of the degree of the curve fitting polynomial (n) and the number of data points (M).

Table of R(M, α, K, n), fixed α, for evaluating the relative errors using the asymptotic root mean square formula compared with zero-lag
(α = 0) estimation of the second derivative of the input (K = 2) as a function of the degree of the curve fitting polynomials (n) and the
number of data points (M).

R = [(asymptotic δ − exact δ) / exact δ] × 100 = per cent error in δ using asymptotic formula.

   M        n = 2            3                4                5
   3    δ   2.4495
        R   −29.7
   4    δ   1.0              6.7823
        R   −16.1            −25.8
   5    δ   0.53452          3.2071           14.017
        R   −10.2            −10.2            −28.1
   6    δ   0.32733          1.8919           7.0312           26.312
        R   −7.04            −3.50            −9.12            −35.2
   7    δ   0.21822          1.2440           4.3582           13.334
        R   −5.15            −0.175           −0.268           −13.1
   8    δ   0.15430          0.87535          2.9852           8.4113
        R   −3.93            +1.60            0.428            −1.31
   9    δ   0.11396          0.64578          2.1729           5.8714
        R   −3.10            2.60             0.672            5.32
  10    δ   0.087039         0.49355          1.6494           4.3527
        R   −2.51            3.15             8.03             9.17
  20    δ   0.015094         0.087123         0.29179          0.74414
        R   −0.626           3.30             7.95             12.9
  50    δ   0.0015194        0.0089549        0.030657         0.079685
        R   −0.100           1.70             3.97             6.67
  95    δ   0.00030512       0.0018129        0.0062672        0.016468
        R   −0.0277          0.957            2.21             3.73
1001    δ   0.84641 × 10⁻⁶   0.50735 × 10⁻⁵   ?                0.47264 × 10⁻⁴
        R   ?                0.0972           0.191            0.285
obtains interpolation of the input polynomial. A more
detailed discussion of $\sigma_K$ as a function of $\alpha$ is available.²

Appendix I contains a summary of a few useful properties
of the orthogonal polynomials, while Appendix II
contains a discussion of (22).


CONCLUSION


An exact equation for the mean square error of the
output of an optimum digital filter has been presented.
The formula was derived using curve fitting concepts to
demonstrate the relationship between the concepts of
parameter estimation in curve fitting and weighting
function optimization in linear filtering.

Eq. (11) represents a convenient formula for computing
the weighting sequence of the digital filter.

From Tables I-III one may determine both the
relative error associated with using the asymptotic
formula for $\delta$ and the exact value of $\delta$ for small $M$.

In Figs. 1-3 the values of $\delta$ can be determined for those
values of $M$ not tabulated. For $M > 100$ one can either
extrapolate linearly on log-log paper or use the asymptotic
formula.


APPENDIX I


A listing of the orthogonal polynomials in consistent
notation is as follows:⁴

$$\xi_0(u) = 1$$
$$\xi_1(u) = (u - \bar{u}), \qquad \bar{u} = \frac{M + 1}{2}$$
$$\xi_2(u) = (u - \bar{u})^2 - \frac{M^2 - 1}{12}$$
$$\xi_3(u) = (u - \bar{u})^3 - (u - \bar{u})\,\frac{3M^2 - 7}{20}$$
$$\xi_4(u) = (u - \bar{u})^4 - (u - \bar{u})^2\,\frac{3M^2 - 13}{14} + \frac{3(M^2 - 1)(M^2 - 9)}{560}$$
$$\xi_5(u) = (u - \bar{u})^5 - (u - \bar{u})^3\,\frac{5(M^2 - 7)}{18} + (u - \bar{u})\,\frac{15M^4 - 230M^2 + 407}{1008}. \tag{23}$$

The functions satisfy the following recursion relationship

$$\xi_{v+1}(u) = \xi_1(u)\,\xi_v(u) - \frac{v^2 (M^2 - v^2)}{4(4v^2 - 1)}\,\xi_{v-1}(u). \tag{24}$$

As indicated the recursion is an identity so that by repeated
differentiation one obtains

$$\xi_{v+1}^{(K)}(u) = \xi_1(u)\,\xi_v^{(K)}(u) + K\,\xi_v^{(K-1)}(u) - \frac{v^2 (M^2 - v^2)}{4(4v^2 - 1)}\,\xi_{v-1}^{(K)}(u) \tag{25}$$

⁴ R. L. Anderson and E. E. Houseman, "Tables of Orthogonal
Polynomials Values Extended to N = 104," Iowa State College of
Agriculture and Mechanical Arts, Ames, Iowa, Res. Bull. 297;
April, 1942.


Fig. 1. Output of the digital filter is the least squares estimate of
the input with zero lag.

$\sigma_K$ = rms error output (zero-lag estimation).
$\sigma_N$ = rms error of noise.
$\delta = \sigma_K / \sigma_N$.
$n$ = degree of polynomial passed without error.
$(M - 1)T$ = memory span of filter.
where 
Pee 0. 0s <0, and Iet<cy 2 -= Lt 
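so that

$$\xi_{v+1}^{(K)}(M + \alpha) = \Big[\frac{M - 1}{2} + \alpha\Big]\,\xi_v^{(K)}(M + \alpha) + K\,\xi_v^{(K-1)}(M + \alpha) - \frac{v^2 (M^2 - v^2)}{4(4v^2 - 1)}\,\xi_{v-1}^{(K)}(M + \alpha). \tag{26}$$

Finally the sum of squares $S(L, M)$ is given by⁵

$$S(L, M) = \frac{(L!)^4\,(M + L)!}{(2L)!\,(2L + 1)!\,(M - L - 1)!}. \tag{27}$$

Higher order polynomials to degree 10 are listed by Allen.⁵
A very complete table of the values of $\xi_v(u)$ for $v = 0$ to 5,
$M$ from $v + 2$ to 104 is made available by Anderson and
Houseman.⁴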
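The recursion and the sum-of-squares formula can be cross-checked against each other. The sketch below generates the polynomial values on $u = 1, \ldots, M$ from the three-term recursion, starting from $\xi_0 = 1$ and $\xi_1 = u - (M+1)/2$, and compares their sums of squares with the closed form $(L!)^4 (M+L)! / [(2L)!\,(2L+1)!\,(M-L-1)!]$. Both formulas are taken here as read from a damaged scan, and the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def xi_table(n, M):
    """Values of xi_0..xi_n at u = 1..M, built from the three-term recursion."""
    ubar = Fraction(M + 1, 2)
    rows = [[Fraction(1)] * M, [u - ubar for u in range(1, M + 1)]]
    for v in range(1, n):
        c = Fraction(v * v * (M * M - v * v), 4 * (4 * v * v - 1))
        rows.append([rows[1][i] * rows[v][i] - c * rows[v - 1][i] for i in range(M)])
    return rows[: n + 1]


def sum_of_squares_closed_form(L, M):
    """Assumed closed form for S(L, M); requires L < M."""
    return Fraction(factorial(L) ** 4 * factorial(M + L),
                    factorial(2 * L) * factorial(2 * L + 1) * factorial(M - L - 1))


M = 8
xi = xi_table(5, M)
for L, row in enumerate(xi):
    print(L, sum(x * x for x in row), sum_of_squares_closed_form(L, M))
```

Exact rational arithmetic is used so that orthogonality shows up as sums that are exactly zero rather than merely small.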


APPENDIX II

DISCUSSION OF ASYMPTOTIC FORMULA


The asymptotic relation obtained by Johnson¹ is given
by

$$\delta \approx \frac{(K + n + 1)!}{(n - K)!\,K!\,\sqrt{2K + 1}}\,(M - 1)^{-(2K+1)/2}.$$

Eq. (22) presented in this report was obtained by using
the curve fitting concepts and the orthogonal polynomials
in the factorial form as presented in another paper.⁶ The
equations are asymptotically identical since the factor
$(M - 1) \to M$ for large $M$, and the remainder of the
equation give rise to the same coefficient. There are two
properties of (22), however, which are apparent in the
derivation.

First when $n < K$, e.g., the degree of the curve
fitting polynomial is less than the order of the derivative,
$\delta = 0$. To see this more clearly, consider a filter designed
to pass linear functions ($n = 1$) and let the input be a
constant or linear. Suppose the zero-lag estimate of the
second derivative ($K = 2$) is the desired output. Thus,
for any linear or constant input, the output should be
zero. This result is assured if the weighting coefficients
are all zero. The mean square error associated with the
weighting coefficients is also zero, which is certainly the
minimum. This result is implied if $(n - K)!$ is defined as
infinity for $n < K$ but is not stated explicitly.

Second, (22) was obtained by expressing (17) for $\delta^2$
in terms of the leading coefficient of the highest power
of $M$ (in both numerator and denominator) for each
term. The factorial form of the orthogonal polynomials
of Milne⁶ leads directly to (22). As long as $\alpha$ is not a
function of $M$, note that (22) is independent of $\alpha$ for
$M \gg \alpha$. Thus to the degree of approximation represented
by this form of the asymptotic formula, $\delta$ is independent
of $\alpha$. Eq. (22) with the factor $M^{-(2K+1)/2}$ was obtained
directly by evaluating $\xi_L^{(K)}(M)$, e.g., ($\alpha = 0$), and thus R of
Tables I-III approaches zero as $M \to \infty$ faster than when
the factor $(M - 1)^{-(2K+1)/2}$ is used in the asymptotic
formula.

⁵ F. E. Allen, "The general form of the orthogonal polynomials
for simple series with proofs of their simple properties," Proc. Roy.
Soc., Edinburgh, vol. 50, pp. 310-320; 1935.

⁶ W. E. Milne, "Numerical Calculus," Princeton Univ. Press,
Princeton, N. J.; 1949.

Fig. 2. Output of the digital filter is the least squares, zero-lag,
estimation of the first derivative of the input (abscissa: M = number of input data points).

$\sigma_K$ = rms of output.
$\sigma_N$ = rms of input.
$n$ = degree of polynomial passed by filter without error.
$T$ = interval between samples.
$(M - 1)T$ = finite memory of filter.

Fig. 3. Output of the digital filter is the least squares, zero-lag (α = 0)
estimation of the second derivative of the input (abscissa: M = number of input data points), where

$\sigma_K$ = rms of the output.
$\sigma_N$ = rms of the noise.
$n$ = degree of polynomial passed without error.
$T$ = interval between samples.
$(M - 1)T$ = finite memory of filter.


= 


AppENDIx III 


“Brest” DERIVATIVE FILTERS 


In this Appendix the correspondence between the 
tlerivative of the least squares curve fit and the optimum 
derivative filter of another paper’ will be shown. It is 
required to modify certain of those parameters to corre- 
spond to the notation of this paper. 

The total number of samples is given by

$$M = m + 1. \tag{28}$$

Define $P_L(j\Delta t)$ of Blum's (4) as

$$P_L(v) = \frac{\xi_L(v)}{[S(L, M)]^{1/2}}, \qquad L = 0, 1, 2, \cdots, n. \tag{29}$$

For the input model being considered, Blum's (44) reduces to

$$[W] = [P'][Q], \tag{30}$$

since the matrix $V = \sigma^2 I$ ($I$ is an identity matrix for uncorrelated noise), the matrix $\delta = 0$ for $M(t) = 0$, and the matrix $[PP'] = I$ because of the orthonormality of $P(v)$. (See Blum and Fisher and Yates.)

The matrix $P'$ is given by (25) of Blum and becomes

$$P' = \begin{bmatrix} P_0(M) & P_1(M) & \cdots & P_n(M) \\ P_0(M-1) & P_1(M-1) & \cdots & P_n(M-1) \\ \vdots & \vdots & & \vdots \\ P_0(1) & P_1(1) & \cdots & P_n(1) \end{bmatrix}. \tag{31}$$

The matrix $Q$ is given by (13) and (38) of Blum,

$$Q_L(M + \alpha) = \frac{\xi_L^{(K)}(M + \alpha)}{[S(L, M)]^{1/2}}. \tag{32}$$

The intent in defining $\alpha$ is that $\alpha = 0$ represents a zero-lag estimate. The most current data point is measured at $t = M$, as opposed to $t = m\Delta t$ in the previous notation.

On expanding (30) for the $v$th weighting coefficient $W_v$, one obtains

$$W_v = \sum_{L=0}^{n} P_L(M - v)\,Q_L(M + \alpha), \qquad v = 0, 1, \cdots, M - 1. \tag{33}$$

Substituting (29) and (32) one has

$$W_v = \sum_{L=0}^{n} \frac{\xi_L(M - v)\,\xi_L^{(K)}(M + \alpha)}{S(L, M)}. \tag{34}$$

Finally let $v = M - u$, $u = 1, 2, \cdots, M$; then

$$W_{M-u} = \sum_{L=0}^{n} \frac{\xi_L(u)\,\xi_L^{(K)}(M + \alpha)}{S(L, M)}, \tag{35}$$

which is identical to (11).
DERIVATIVE ESTIMATES WITH NONUNITY
SAMPLING INTERVALS


The effect of $T \ne 1$ will be considered. The polynomials listed in Appendix I are nondimensional; thus when $T \ne 1$ one must substitute


bee ly 
- [SF 


An example of effect of taking derivatives is given as 
follows 


Let u = vT, then 


dé(VT, 9 et 1 7 (u, | 1), 
du 2 

and 
dé, rae le d’é,(u, 1) 
du’ OE) re Ta 

By similar methods one can show that 
dé, (oT, ite) ee il dé (u, Li). 

du Tee thie 


From (11) one sees that $W_{M-u}$ is a linear combination of terms, each of which is the $K$th derivative with respect to $u$ of $\xi_L(vT, T)$ evaluated at $u = (M + \alpha)T$, so that $T^{-K}$ factors out of the expression for $W_{M-u,T}$, as stated in (16).


The Distribution of the Number of Crossings of a
Gaussian Stochastic Process*


CARL W. HELSTROM†


Summary—It is shown how filtered Gaussian noise having a 
power spectrum which is a rational function of the square of the 
frequency can be represented as one component of a multidimen- 
sional Markov process. Methods are studied for obtaining the dis- 
tribution of the number of times such a noise process crosses a 
given amplitude level in a fixed time interval. The generating 
function of this distribution is the solution of a Fokker-Planck 
type differential equation with appropriate boundary conditions. 
Integral equations are found for the generating function from 
which all the moments of the distribution can be calculated by 
iteration.


I. INTRODUCTION


A PROBLEM of interest in the theory of stochastic processes is to find the probability $p_n(t)$ that a random variable $x(t)$ crosses a given level, say $x = a$, $n$ times in an interval of length $t$. For instance, this distribution determines the response to random noise of a device which measures frequency by counting the number of zero-crossings of its input. Also, the false-alarm probability for a gated radar detection system is $1 - p_0(t)$, where $p_0(t)$ is the probability that a given triggering level is not crossed by the noise in the gating interval of length $t$. The quantity $(-\partial p_0/\partial t)$ is the probability density function of the first time a noise wave crosses the level $x = a$, given certain initial conditions at $t = 0$. The problem of finding the latter has been solved for a process representing the Brownian motion of a free particle¹ and for one-dimensional Markov processes.² On the other hand, Rice³ has given a formula for the mean number $\bar{n}$ of crossings in a given interval, and formulas for the variance of this number have been given by Steinberg et al.⁴ and by Miller and Freund.⁵

In this paper we shall discuss the problem of finding the distribution $p_n(t)$ in the case of a stationary Gaussian random process $x(t)$ which is one component of a multidimensional Markov process. In the next section we shall show how any such Gaussian process having a power spectrum which is a rational function of $\omega^2$, $\omega$ = (angular) frequency = $2\pi f$, can be represented as one component of an $n$-dimensional Markov process, where $\omega^{2n}$ is the

* Manuscript received by the PGIT, March 25, 1957.

† Westinghouse Res. Labs., Pittsburgh 35, Pa.

¹ A. Blanc-Lapierre and R. Fortet, “Théorie des Fonctions Aléatoires,” Masson et Cie, Paris, France; 1953.

² A. J. F. Siegert, “On the first passage time probability function,” ... the earlier literature are given here and in Blanc-Lapierre and Fortet, loc. cit.

³ S. O. Rice, “The mathematical analysis of random noise,” Bell Sys. Tech. J., vol. 24, pp. 46-156; January, 1945.

⁴ H. Steinberg, P. M. Schultheiss, C. A. Wogrin, and F. Zweig, “Short-time frequency measurement of narrow-band random signals by means of a zero counting process,” J. Appl. Phys., vol. 26, pp. 195-201; February, 1955.

⁵ I. Miller and J. E. Freund, “Some results on the analysis of random signals by means of a cut-counting process,” J. Appl. Phys., vol. 27, pp. 1290-1293; November, 1956.


highest power of $\omega$ in the denominator of the power spectrum. The noise at the output of a linear filter made up of lumped circuit parameters is of this type when the input is white, Gaussian noise. In the last section we shall indicate that the generating function of the distribution $p_n(t)$ is the solution of a Fokker-Planck type differential equation with appropriate boundary conditions. Integral equations are obtained for the generating function which permit the determination of the moments of the distribution. Problems of this type can be attacked by the powerful general methods of Darling and Siegert,⁶ of which our approach of Section III might be termed a discrete analog. However, our method seems to be more direct and more easily understandable in this particular case. Further details of our study of this problem are given in a research report.⁷


II. THE GAUSSIAN MARKOV PROCESS


The principal example of the type of stochastic process to which the theory of the next section is applicable is that of filtered Gaussian noise. The random variable $x(t)$ is assumed for convenience to have zero mean, and its power spectrum is a rational function of $\omega^2$; i.e., it can be written

$$\Phi(\omega) = \frac{N(i\omega)N(-i\omega)}{P(i\omega)P(-i\omega)} = \left|\frac{N(i\omega)}{P(i\omega)}\right|^2, \tag{1}$$

where $N(z)$ and $P(z)$ are polynomials in $z$ with real coefficients:

$$N(z) = N_0 z^m + N_1 z^{m-1} + \cdots + N_m,$$
$$P(z) = z^n + p_1 z^{n-1} + \cdots + p_n, \qquad m < n. \tag{2}$$

The decomposition (1) is made so that the zeros of $N(z)$ and $P(z)$ have negative real parts. Such a process can be generated by passing white Gaussian noise $y(t)$ through a linear filter of admittance $Y(\omega) = N(i\omega)/P(i\omega)$; i.e., $x(t)$ satisfies the differential equation

$$P(D)x(t) = N(D)y(t), \qquad D = d/dt. \tag{3}$$


We wish to find (n — 1) other functions of t such that 
with x(t) they form the n components of a Gaussian 
Markov process. The many ways of doing this can be 
derived from each other by linear transformations, and 


only one is presented here. The proofs of the following results are simple and are omitted for the sake of brevity; details can be found in another report.⁷ The components $x_r(t)$ of the $n$-dimensional process are connected by the differential equations

$$dx_r/dt = x_{r+1} + C_r y(t), \qquad 0 \le r \le n - 2,$$
$$dx_{n-1}/dt = -\sum_{s=1}^{n} p_s x_{n-s} + C_{n-1} y(t),$$
$$x_0(t) = x(t). \tag{4}$$

⁶ D. A. Darling and A. J. F. Siegert, “A systematic approach to a class of problems in the theory of noise and other random phenomena—Part I,” IRE Trans., vol. IT-3, pp. 32-37; March, 1957.

⁷ C. W. Helstrom, “Level-Crossing Problems for Gaussian Stochastic Processes,” Westinghouse Res. Labs., Pittsburgh, Pa., Rep. 8-1259-R5; March, 1957.
The constants $C_r$ can be found from the contour integrals

$$C_r = \frac{1}{2\pi i}\oint_C \frac{z^r N(z)}{P(z)}\,dz, \tag{5}$$

where $C$ encloses all the poles of the integrand. To evaluate such an integral the transformation $z = 1/w$ is useful. Then it is easy to show that

$$C_r = 0, \quad 0 \le r \le n - m - 2 \quad (m < n - 1); \qquad C_{n-m-1} = N_0; \tag{6}$$

while the rest of the $C_r$ depend on the other coefficients of $N(z)$ and $P(z)$. Using (5) and eliminating the $x_r$ successively from (4) one can show that these $n$ simultaneous first-order differential equations are equivalent to (3). Note that

$$x_r(t) = d^r x/dt^r, \qquad 0 \le r \le n - m - 1, \tag{7}$$

so that the first $(n - m)$ components contain $x(t)$ and its first $(n - m - 1)$ derivatives. The process $x(t)$ is said to be differentiable at most $(n - m - 1)$ times.
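The contour integral (5) picks out the coefficient of $z^{-(r+1)}$ in the expansion of $N(z)/P(z)$ in powers of $1/z$, so the $C_r$ can be generated by polynomial long division. The sketch below is an illustration under that reading, not the method of the report cited above; the function names `markov_constants` and `state_matrix` are hypothetical, and coefficients are passed in descending powers with $P$ monic, as in (2). It also checks numerically that the first-order system (4) reproduces the transfer function $N/P$ from $y$ to $x_0$.

```python
import numpy as np


def markov_constants(N, P):
    """C_0..C_{n-1} of (5): coefficients of z**-(r+1) in N(z)/P(z).
    N and P hold coefficients in descending powers; P must be monic."""
    N = np.asarray(N, dtype=float)
    P = np.asarray(P, dtype=float)
    n, m = len(P) - 1, len(N) - 1
    if P[0] != 1.0 or m >= n:
        raise ValueError("need monic P and deg N < deg P")
    b = np.concatenate([np.zeros(n - 1 - m), N])  # N padded to degree n - 1
    C = np.zeros(n)
    for j in range(n):
        # Match the coefficient of z**(n-1-j) in N(z) = P(z) * sum_k C_k z**-(k+1)
        C[j] = b[j] - sum(P[i] * C[j - i] for i in range(1, j + 1))
    return C


def state_matrix(P):
    """Drift matrix of the first-order system (4)."""
    P = np.asarray(P, dtype=float)
    n = len(P) - 1
    A = np.zeros((n, n))
    for r in range(n - 1):
        A[r, r + 1] = 1.0            # dx_r/dt = x_{r+1} + ...
    for s in range(1, n + 1):
        A[n - 1, n - s] = -P[s]      # dx_{n-1}/dt = -sum_s p_s x_{n-s} + ...
    return A


# Example with m = 1, n = 2: N(z) = 2z + 1, P(z) = z**2 + 3z + 2
N, P = [2.0, 1.0], [1.0, 3.0, 2.0]
C = markov_constants(N, P)
A = state_matrix(P)
s = 0.7j                              # arbitrary test frequency
x0 = np.linalg.solve(s * np.eye(2) - A, C.astype(complex))[0]
print(C, abs(x0 - np.polyval(N, s) / np.polyval(P, s)))
```

Here $C_0 = N_0$ because $n - m - 1 = 0$, in agreement with (6); with $N(z) = 1$ the same routine gives $C_0 = 0$, $C_1 = N_0$.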
In terms of the past history of $x(t)$ the values of all the $x_r(t)$ at time $t$ can be found by means of the operators $\theta_r(D)$:

$$x_r(t) = \theta_r(D)x(t), \qquad D = d/dt. \tag{8}$$

These operators are rational functions of $D$ and can be found from the contour integral

$$\theta_r(p) = \frac{P(p)}{2\pi i\,N(p)}\oint_C \frac{z^r N(z)}{(p - z)P(z)}\,dz, \tag{9}$$

where $C$ encloses all poles of the integrand except that at $z = p$. In particular

$$\theta_r(p) = p^r, \qquad 0 \le r \le n - m - 1,$$
$$\theta_{n-m}(p) = \frac{p^{n-m}N(p) - N_0 P(p)}{N(p)}. \tag{10}$$

In this operational notation the operator $(D - \mu)^{-1}$ is defined as usual by

$$(D - \mu)^{-1}x(t) = e^{\mu t}\int_{-\infty}^{t} e^{-\mu s}x(s)\,ds, \tag{11}$$


where the real part of $\mu$ is negative. $[N(D)]^{-1}$ is written as a product of such factors when it appears in the $\theta_r(D)$. The latter can always be evaluated in such a way that no more than $(n - m - 1)$ differentiations of $x(t)$ are required. [Special consideration must be given to the case in which $N(z)$ vanishes for $z = 0$.⁷]

That the $n$ functions $x_r(t)$ together form the components of a Gaussian Markov process is related to the fact that they are connected with each other and with $y(t)$ through the differential equations (4) of first order. If $y(t)$ is Gaussian, it can be shown that the joint conditional probability density function

$$p(\mathbf{x}, t \mid \mathbf{x}^0, t_0), \qquad \mathbf{x} = \{x_r(t)\}, \quad \mathbf{x}^0 = \{x_r(t_0)\},$$

satisfies a Fokker-Planck equation of the form

$$\frac{1}{2}\sum_{r=n-m-1}^{n-1}\ \sum_{s=n-m-1}^{n-1} C_r C_s \frac{\partial^2 p}{\partial x_r\,\partial x_s} - \sum_{r=0}^{n-2} x_{r+1}\frac{\partial p}{\partial x_r} + \frac{\partial}{\partial x_{n-1}}\left[\left(\sum_{s=1}^{n} p_s x_{n-s}\right)p\right] - \frac{\partial p}{\partial t} = Lp - \partial p/\partial t = 0, \tag{12}$$


where $L$ is a differential operator on the space coordinates $(x_0, x_1, \cdots, x_{n-1})$. The coefficients in this equation were derived from (4) by the method discussed by Wang and Uhlenbeck.⁸ By another method given there one can show that the solution of (12) which vanishes at infinity in all spatial directions, which is zero for $t < t_0$, and which approaches a product of delta functions as $t$ approaches $t_0$ from above:

$$\lim_{t \to t_0} p(\mathbf{x}, t \mid \mathbf{x}^0, t_0) = \prod_{r=0}^{n-1}\delta(x_r - x_r^0), \qquad t > t_0, \tag{13}$$

is a Gaussian function of the form

$$p(\mathbf{x}, t \mid \mathbf{x}^0, t_0) = (2\pi)^{-n/2}\left(\det\|\phi_{rs}\|\right)^{-1/2}\exp\left[-\frac{1}{2}\sum_{r=0}^{n-1}\sum_{s=0}^{n-1}\mu_{rs}(x_r - \bar{x}_r)(x_s - \bar{x}_s)\right], \tag{14}$$

where $\bar{x}_r = \bar{x}_r(t) = E\{x_r, t \mid \mathbf{x}^0, t_0\}$ is the mean value of $x_r$ at time $t$ when the components had the given set of initial values $x_r^0 = x_r(t_0)$, $0 \le r \le n - 1$, and $E$ denotes the expectation value. The matrix $\|\mu_{rs}\|$ is the inverse of the cross-correlation matrix $\|\phi_{rs}\|$ of $x_r(t)$ and $x_s(t)$:

$$\phi_{rs}(t) = E\{(x_r - \bar{x}_r)(x_s - \bar{x}_s), t \mid \mathbf{x}^0, t_0\}. \tag{15}$$

The $\bar{x}_r(t)$ and $\phi_{rs}(t)$ can be computed in terms of the spectrum (1) in a straightforward manner⁷ by use of (8). The solution (14) is the basic transition probability density function for the $n$-dimensional Markov process.

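As a function of the initial values $\mathbf{x}^0 = \{x_r^0\}$ the transition probability (14) satisfies the adjoint Fokker-Planck equation

$$\frac{1}{2}\sum_{r=n-m-1}^{n-1}\ \sum_{s=n-m-1}^{n-1} C_r C_s \frac{\partial^2 p}{\partial x_r^0\,\partial x_s^0} + \sum_{r=0}^{n-2} x_{r+1}^0\frac{\partial p}{\partial x_r^0} - \left(\sum_{s=1}^{n} p_s x_{n-s}^0\right)\frac{\partial p}{\partial x_{n-1}^0} + \frac{\partial p}{\partial t_0} = L_0 p + \partial p/\partial t_0 = -\prod_{r=0}^{n-1}\delta(x_r - x_r^0)\,\delta(t - t_0), \tag{16}$$

where $L_0$ is the differential operator adjoint to $L$; the subscript denotes that it operates on the $\{x_r^0\}$.

The Fokker-Planck equation (12) has the form of a conservation equation if it is written in terms of the $n$-dimensional divergence:

$$\operatorname{div}\mathbf{J} + \partial p/\partial t = 0, \tag{17}$$

where the “current” $\mathbf{J} = \{J_r\}$ is

$$J_r = x_{r+1}\,p - \frac{1}{2}C_r\sum_{s=n-m-1}^{n-1} C_s\,\partial p/\partial x_s, \qquad 0 \le r \le n - 2,$$
$$J_{n-1} = -p\sum_{s=1}^{n} p_s x_{n-s} - \frac{1}{2}C_{n-1}\sum_{s=n-m-1}^{n-1} C_s\,\partial p/\partial x_s. \tag{18}$$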


Note that if $x(t)$ is differentiable at least once, $m < n - 1$, by (6) $C_0 = 0$ and $J_0 = x_1 p = \dot{x}p$, where the “velocity” $\dot{x} = dx/dt$. Then for the current in the $x$ direction we have only a transport term due to the velocity $\dot{x}$ and no gradient terms. Gradient terms disappear in all $J_r$ for $0 \le r \le n - m - 2$.

It is convenient for purposes of visualization to speak of these equations as though they described the diffusion of particles having certain initial positions, velocities, etc., and we shall often do this in the sequel. This may be called the “Eulerian” approach, as contrasted to the “Lagrangian” method which considers the history of an individual particle. The results are applicable to other situations, of course, where $x(t)$ may be for instance a noise voltage. A thorough treatment of the two-dimensional case has been given by Wang and Uhlenbeck,⁸ and they discuss the general problem of the thermal noise in a circuit containing $N$ meshes of resistors, inductors, and capacitors, showing that this noise can be represented in terms of a Markov process of $2N$ dimensions.
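As a concrete instance of the Markov representation, the long-time limit of the covariance matrix of (15) can be computed directly from the first-order system (4). The sketch below assumes the white noise $y(t)$ has unit intensity, $E\{y(t)y(t')\} = \delta(t - t')$, which is the normalization implied by the factor $\tfrac{1}{2}C_rC_s$ in (18); under that assumption the stationary covariance $\phi$ solves the Lyapunov equation $A\phi + \phi A^T + CC^T = 0$, where $A$ is the drift matrix of (4). This is a standard result used here for illustration and is not taken from the paper; the function name is hypothetical.

```python
import numpy as np


def stationary_covariance(A, C):
    """Solve A @ phi + phi @ A.T + outer(C, C) = 0 for phi using Kronecker
    products (assumes every eigenvalue of A has negative real part)."""
    A = np.asarray(A, dtype=float)
    C = np.asarray(C, dtype=float)
    n = A.shape[0]
    I = np.eye(n)
    K = np.kron(A, I) + np.kron(I, A)      # acts on the row-major vec of phi
    rhs = -np.outer(C, C).reshape(-1)
    return np.linalg.solve(K, rhs).reshape(n, n)


# Second-order example: P(z) = z**2 + p1*z + p2, N(z) = N0 (m = 0, n = 2)
p1, p2, N0 = 3.0, 2.0, 1.0
A = np.array([[0.0, 1.0], [-p2, -p1]])     # from (4)
C = np.array([0.0, N0])                    # from (6): C_0 = 0, C_1 = N0
phi = stationary_covariance(A, C)
print(phi)   # diagonal: var(x) = N0**2/(2*p1*p2), var(dx/dt) = N0**2/(2*p1)
```

The vanishing off-diagonal entries show that $x$ and its velocity are uncorrelated in the stationary state, which is what makes the equilibrium density factor in the crossing-rate formula of Section III.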


III. THE DISTRIBUTION OF THE NUMBER OF CROSSINGS


It is assumed that the random process $x(t)$ can be represented as one component of a Gaussian Markov process in $n$ dimensions, $n > 1$, in the manner discussed above. It is further assumed that $x(t)$ is differentiable at least once ($m < n - 1$), so that we can define a velocity $\dot{x}(t) = x_1(t) = dx/dt$ as the second component of the Markov process. Let $p_n(t) = p_n(t \mid \mathbf{x}^0)$ be the probability that the random variable $x(t)$ starting at the initial value $\mathbf{x}^0 = (x_0, \dot{x}_0, x_{20}, \cdots, x_{n-1,0})$ at $t = 0$ crosses the level $x = a$ $n$ times in the interval $0 < t' < t$. In what follows we write this $p_n(t \mid x_0, \dot{x}_0)$ and omit the dependence on the variables $x_{r0} = x_r(0)$, $2 \le r \le n - 1$, but the presence of this dependence must be kept in mind. In expressions involving integrals, integrations are also taken over the entire ranges of these $(n - 2)$ unwritten variables. We shall find equations for the generating function of this distribution:

$$h(t \mid x_0, \dot{x}_0; z) = \sum_{n=0}^{\infty} z^n p_n(t \mid x_0, \dot{x}_0). \tag{19}$$


⁸ M. C. Wang and G. E. Uhlenbeck, “On the theory of the Brownian motion II,” Rev. Mod. Phys., vol. 17, pp. 323-342; April, July, 1945.

If $x_0 < a$ we denote this by $h^-(t \mid x_0, \dot{x}_0; z)$, if $x_0 > a$ by $h^+(t \mid x_0, \dot{x}_0; z)$. If some distribution of initial values at $t = 0$ is given, $h(t \mid x_0, \dot{x}_0; z)$ can be multiplied by it at the end and the product integrated to determine the generating function for the given problem. Thus we are dealing here with the generating function of a conditional distribution. The quantity $p_0(t \mid x_0, \dot{x}_0) = h(t \mid x_0, \dot{x}_0; 0)$ is the probability that $x(t)$ does not cross $x = a$ in the interval $0 < t' < t$. For the Markov processes they considered, it has been pointed out by Siegert² and Wasow⁹ that $p_0(t \mid x_0, \dot{x}_0)$ satisfies a partial differential equation involving the adjoint Fokker-Planck operator $L_0$ of (16):

$$L_0\,p_0(t \mid x_0, \dot{x}_0) - \partial p_0/\partial t = 0, \qquad x_0 \ne a. \tag{20}$$

For the $n$-dimensional process of Section II, the boundary conditions on this equation are, for $x_0 < a$,

$$p_0(t \mid a, \dot{x}_0) = 0, \qquad \dot{x}_0 > 0;$$
$$p_0(0 \mid x_0, \dot{x}_0) = 1, \qquad x_0 < a. \tag{21}$$


Further we can define a function $G_-(x, \dot{x}, t \mid x_0, \dot{x}_0, t_0)$ as the joint density function of $x, \dot{x}, x_2, \cdots, x_{n-1}$ at time $t$, given that the trajectories started at $x_0$, $\dot{x}_0$, etc. at time $t_0$ with $x_0 < a$, and remained in the region $x < a$ for the entire interval $t_0 < t' < t$. Then

$$p_0(t \mid x_0, \dot{x}_0) = \int_{-\infty}^{a} dx\int_{-\infty}^{\infty} d\dot{x}\;G_-(x, \dot{x}, t \mid x_0, \dot{x}_0, 0). \tag{22}$$

The function $G_-(x, \dot{x}, t \mid x_0, \dot{x}_0, t_0)$ satisfies a Fokker-Planck equation of the form (12) as a function of the final coordinates $x, \dot{x}, \cdots, x_{n-1}$ with the additional boundary condition that it vanishes at $x = a$ for $\dot{x} < 0$. As a function of the initial coordinates $x_0, \dot{x}_0, \cdots, x_{n-1,0}$ it satisfies (16) with the additional boundary condition that it vanishes at $x_0 = a$ for $\dot{x}_0 > 0$. A similar function $G_+(x, \dot{x}, t \mid x_0, \dot{x}_0, t_0)$ can be defined for trajectories which remain in the region $x > a$ for the whole interval $t_0 < t' < t$.

Referring to (19) we consider a random process in which the trajectories $x(t)$ are the same as for free diffusion; that is, their development is described by the Fokker-Planck equation (12). Each time a trajectory (or “diffusing particle”) crosses $x = a$, there are probabilities $z < 1$ that it will survive and $(1 - z)$ that it will be removed from the system; but if it survives it continues on its way with no change in velocity or in any other coordinate. Then $h(t \mid x_0, \dot{x}_0; z)$ is the fraction of such particles remaining at time $t$, out of all those starting at $(x_0, \dot{x}_0, \cdots)$ at time $t = 0$.

It is plausible that $h(t \mid x_0, \dot{x}_0; z)$ is a solution of an adjoint Fokker-Planck equation of the same type as (20), and a detailed consideration shows this is so.⁷ That is,

$$L_0\,h(t \mid x_0, \dot{x}_0; z) - \partial h/\partial t = 0, \qquad x_0 \ne a. \tag{23}$$


To find the boundary conditions, consider one particle starting just below $x = a$ with positive velocity, and another starting just above $x = a$ with the same initial conditions. The former will have one more crossing in $0 < t' < t$ than the latter. Denoting the respective probabilities by the superscripts $-$ and $+$,

$$p_n^-(t) = p_{n-1}^+(t), \quad n \ge 1; \qquad p_0^-(t) = 0. \tag{24}$$

Hence from (19)

$$h^-(t; z) = \sum_{n=1}^{\infty} z^n p_n^-(t) = \sum_{n=1}^{\infty} z^n p_{n-1}^+(t) = z\,h^+(t; z), \qquad \dot{x}_0 > 0. \tag{25}$$

Using a similar argument for $\dot{x}_0 < 0$ we get the boundary conditions

$$h^-(t \mid a, \dot{x}_0; z) = z\,h^+(t \mid a, \dot{x}_0; z), \qquad \dot{x}_0 > 0,$$
$$h^+(t \mid a, \dot{x}_0; z) = z\,h^-(t \mid a, \dot{x}_0; z), \qquad \dot{x}_0 < 0, \tag{26}$$

where $h^-(t \mid a, \dot{x}_0; z)$ is the limit of $h(t \mid x_0, \dot{x}_0; z)$ as $x_0 \to a$ from below, while in $h^+(t \mid a, \dot{x}_0; z)$ $x_0 \to a$ from above. We also have the condition

$$h(0 \mid x_0, \dot{x}_0; z) = 1, \tag{27}$$

since $p_0(0 \mid x_0, \dot{x}_0) = 1$, $p_n(0 \mid x_0, \dot{x}_0) = 0$, $n > 0$. Here we have a discrete analog of the method of Darling and Siegert;⁶ a crossing of the boundary $x = a$ corresponds to a “collision” in the language of their article.

⁹ W. Wasow, “On the duration of random walks,” Ann. Math. Stat., vol. 22, pp. 199-216; July, 1951.

We now write down integral equations for the generating function which can be obtained⁷ by applying Green's theorem¹⁰ to (23). In the first set the kernels are the functions $G_-$ and $G_+$ defined above:

$$h^-(t \mid x_0, \dot{x}_0; z) = h^-(t \mid x_0, \dot{x}_0; 0) + z\int_0^t dt_1\int_0^{\infty}\dot{x}_1\,h^+(t - t_1 \mid a, \dot{x}_1; z)\,G_-(a, \dot{x}_1, t_1 \mid x_0, \dot{x}_0, 0)\,d\dot{x}_1, \qquad x_0 < a, \tag{28}$$

$$h^+(t \mid x_0, \dot{x}_0; z) = h^+(t \mid x_0, \dot{x}_0; 0) + z\int_0^t dt_1\int_{-\infty}^{0}|\dot{x}_1|\,h^-(t - t_1 \mid a, \dot{x}_1; z)\,G_+(a, \dot{x}_1, t_1 \mid x_0, \dot{x}_0, 0)\,d\dot{x}_1, \qquad x_0 > a. \tag{29}$$

In the application of Green's theorem, one obtains a term depending on the component of the current $\mathbf{J}$ normal to the surface $x = a$, with an integration over that surface and over the time interval $0 < t' < t$. Because $x(t)$ is differentiable, $m < n - 1$, the component of $\mathbf{J}$ in the $x$-direction consists only of a transport term and no gradient terms [see (18) et seq.], and one obtains the second term on the right of (28) and (29), after using (26). It is because a velocity is definable for these differentiable processes $x(t)$ that the integral equations possess a comparatively simple form.

The meaning of (28) is clear: the first term on the right is the number of particles at time $t$ that have never crossed $x = a$, while the second term classifies the remaining particles according to the first time they crossed $x = a$ with positive velocity. On this crossing only a fraction $z$ survived, and each of these crossing with velocity $\dot{x}_1 > 0$ in $t_1$ to $t_1 + dt_1$ had a probability $h^+(t - t_1 \mid a, \dot{x}_1; z)$ of lasting until time $t$. A similar explanation holds for (29). The distribution $p_n(t \mid x_0, \dot{x}_0)$ can be found by iteration of (28) and (29). For example, for $x_0 < a$,

$$p_1(t) = \int_0^t dt_1\int_0^{\infty}\dot{x}_1\,h^+(t - t_1 \mid a, \dot{x}_1; 0)\,G_-(a, \dot{x}_1, t_1 \mid x_0, \dot{x}_0, 0)\,d\dot{x}_1,$$

$$p_2(t) = \int_0^t dt_2\int_0^{t_2} dt_1\int_0^{\infty}\dot{x}_1\,d\dot{x}_1\int_{-\infty}^{0}|\dot{x}_2|\,d\dot{x}_2\;h^-(t - t_2 \mid a, \dot{x}_2; 0)\,G_+(a, \dot{x}_2, t_2 \mid a, \dot{x}_1, t_1)\,G_-(a, \dot{x}_1, t_1 \mid x_0, \dot{x}_0, 0), \tag{30}$$

results which can of course be written down from first principles.

¹⁰ P. M. Morse and H. Feshbach, “Methods of Theoretical Physics,” McGraw-Hill Book Co., Inc., New York, N. Y.; 1953. See Sections 7.1 and 7.5.


Since the density functions in (28) and (29) are usually 
not known, it is useful to find integral equations involving 
the known transition probability function (14). Appli- 
cation of Green’s theorem yields’ 


haCserteoee = / ax | OE DEE abo eon) 


tz ip dt, i Duly (Ge ty Onenee) 
“D(G, br, Gitex tas Oats 
=F i dt, if ite tl Os eae) | 
(31) 


pla, X1, Lite, ao 0) dias Xo ~G a 


O= / ax | i DE ,k, bean LonO) 


t foo} 
— / dt, it £,h*(t — t,|a, 2132) 
0 (0) 


“pla, Li, t,|2o, Lo; 0) dx, 


t 0 
at at, f Bhat —K tee serxe) 
0 —o 


“pa, 1, ty |", Zo, 0) dé, Ly <a (32) 


We Ctlheos ats 2) = 


if dx if CLD ELE Leon) 


il dt, | gh CE a, tla, Bee) 
0 0 


-p(a, AB a (BA th Xo, 0) dx, 


t 0 
z | at, | Die —slawtase) 


mints espn yl) Ohi. LaLa (33) 


Of | dx [ Cd: UG cde, Hens day) 


at 
dt, / 
70 JO 


fos) 


+2 Tl (be GQ, ys) 


“(Qn das tuto en Oy vany 


rt nO 


> | dt, | that = td, Gy > 2) 


(Oa da will Bone ta Ones to > Gs WACO’) 


For $z = 1$ these equations reduce to various forms of the conservation law (17), since

$$h(t \mid x_0, \dot{x}_0; 1) = 1; \tag{35}$$

in this case there is no loss of particles. The meaning of the equations is also clear when $z = 0$, $h(t \mid x_0, \dot{x}_0; 0) = p_0(t \mid x_0, \dot{x}_0)$; for instance (31) then shows that the number of particles in $x < a$ at time $t$ (the first term on the right) is made up of those which have never crossed $x = a$ (the term on the left) plus those which have crossed $x = a$ at least once, these being classified according to the last time they crossed $x = a$ (the negative of the third term on the right). For $0 < z < 1$ the accounting of the numbers of particles is more complicated.

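The most symmetrical form of these equations is obtained by adding (31) to (32), and (33) to (34). Since the integral of $p(x, \dot{x}, t \mid x_0, \dot{x}_0, 0)$ over all space coordinates $x, \dot{x}, \cdots, x_{n-1}$ must be unity, we get

$$h^-(t \mid x_0, \dot{x}_0; z) = 1 + (z - 1)\int_0^t dt_1\int_0^{\infty}\dot{x}_1\,h^+(t - t_1 \mid a, \dot{x}_1; z)\,p(a, \dot{x}_1, t_1 \mid x_0, \dot{x}_0, 0)\,d\dot{x}_1 + (z - 1)\int_0^t dt_1\int_{-\infty}^{0}|\dot{x}_1|\,h^-(t - t_1 \mid a, \dot{x}_1; z)\,p(a, \dot{x}_1, t_1 \mid x_0, \dot{x}_0, 0)\,d\dot{x}_1, \qquad x_0 < a, \tag{36}$$

$$h^+(t \mid x_0, \dot{x}_0; z) = 1 + (z - 1)\int_0^t dt_1\int_0^{\infty}\dot{x}_1\,h^+(t - t_1 \mid a, \dot{x}_1; z)\,p(a, \dot{x}_1, t_1 \mid x_0, \dot{x}_0, 0)\,d\dot{x}_1 + (z - 1)\int_0^t dt_1\int_{-\infty}^{0}|\dot{x}_1|\,h^-(t - t_1 \mid a, \dot{x}_1; z)\,p(a, \dot{x}_1, t_1 \mid x_0, \dot{x}_0, 0)\,d\dot{x}_1, \qquad x_0 > a. \tag{37}$$

Other integral equations can be obtained⁷ in terms of a function $H(x, \dot{x}, t \mid x_0, \dot{x}_0, 0; z)$ which describes the density distribution of particles in our artificial absorption process.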

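By (19) the moments of the distribution $p_n(t \mid x_0, \dot{x}_0)$ are related to the coefficients of a power-series expansion of $h(t \mid x_0, \dot{x}_0; z)$ about the point $z = 1$. This expansion is obtained immediately by solving (36) and (37) by iteration:

$$h^-(t \mid x_0, \dot{x}_0; z) = 1 + (z - 1)\int_0^t dt_1\int_{-\infty}^{\infty}|\dot{x}_1|\,p(a, \dot{x}_1, t_1 \mid x_0, \dot{x}_0, 0)\,d\dot{x}_1$$
$$+ \lim_{\epsilon \to 0}\,(z - 1)^2\int_0^t dt_2\int_0^{t_2} dt_1\int_{-\infty}^{\infty}|\dot{x}_2|\,d\dot{x}_2\int_0^{\infty}\dot{x}_1\,d\dot{x}_1\;p(a - \epsilon, \dot{x}_2, t_2 \mid a, \dot{x}_1, t_1)\,p(a, \dot{x}_1, t_1 \mid x_0, \dot{x}_0, 0)$$
$$+ \lim_{\epsilon \to 0}\,(z - 1)^2\int_0^t dt_2\int_0^{t_2} dt_1\int_{-\infty}^{\infty}|\dot{x}_2|\,d\dot{x}_2\int_{-\infty}^{0}|\dot{x}_1|\,d\dot{x}_1\;p(a + \epsilon, \dot{x}_2, t_2 \mid a, \dot{x}_1, t_1)\,p(a, \dot{x}_1, t_1 \mid x_0, \dot{x}_0, 0) + \cdots, \qquad x_0 < a, \quad \epsilon > 0. \tag{38}$$

Thus the second term yields Rice's result³ for the mean number $\bar{n}$ of crossings in an interval of length $t$:

$$\bar{n}(t \mid x_0, \dot{x}_0) = \int_0^t dt_1\int_{-\infty}^{\infty}|\dot{x}_1|\,p(a, \dot{x}_1, t_1 \mid x_0, \dot{x}_0, 0)\,d\dot{x}_1. \tag{39}$$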


In most problems the particles are initially distributed in accordance with the equilibrium distribution

$$W(x, \dot{x}) = \lim_{t \to \infty} p(x, \dot{x}, t \mid x_0, \dot{x}_0, 0).$$

Multiplying (39) by $W(x_0, \dot{x}_0)$ and integrating over the entire ranges of these variables, we obtain the usual formula for the average number of crossings:

$$\bar{n}(t) = t\int_{-\infty}^{\infty}|\dot{x}|\,W(a, \dot{x})\,d\dot{x}. \tag{40}$$

This formula has been applied by Middleton¹¹ to compute the false-alarm rate of a triggered circuit such as that in a radar detection system.
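Formula (40) is easy to evaluate when the equilibrium density is known. The sketch below assumes, as holds for a stationary differentiable zero-mean Gaussian process, that $x$ and $\dot{x}$ are independent Gaussians in equilibrium with variances `var_x` and `var_v`, so that $W(a, \dot{x})$ is a product of two normal densities. It compares a direct numerical integration of (40) per unit time with the familiar closed form $(1/\pi)\sqrt{\mathrm{var}_v/\mathrm{var}_x}\,\exp(-a^2/2\,\mathrm{var}_x)$. The function names and the grid parameters are illustrative choices.

```python
import numpy as np


def crossing_rate_numeric(a, var_x, var_v, n_grid=200001, span=12.0):
    """Integral of |v| W(a, v) dv from (40), per unit time, by a Riemann sum
    over a velocity grid extending span standard deviations each way."""
    sv = np.sqrt(var_v)
    v = np.linspace(-span * sv, span * sv, n_grid)
    dv = v[1] - v[0]
    w_x = np.exp(-a**2 / (2.0 * var_x)) / np.sqrt(2.0 * np.pi * var_x)
    w_v = np.exp(-v**2 / (2.0 * var_v)) / np.sqrt(2.0 * np.pi * var_v)
    return float(np.sum(np.abs(v) * w_x * w_v) * dv)


def crossing_rate_closed_form(a, var_x, var_v):
    """Closed form of the same integral for independent Gaussian x and v."""
    return float(np.sqrt(var_v / var_x) / np.pi * np.exp(-a**2 / (2.0 * var_x)))


# Variances of the second-order example P(z) = z**2 + 3z + 2, N(z) = 1
var_x, var_v = 1.0 / 12.0, 1.0 / 6.0
print(crossing_rate_numeric(0.2, var_x, var_v),
      crossing_rate_closed_form(0.2, var_x, var_v))
```

The rate counts crossings in both directions, matching the absolute value in (40); the expected number in an interval of length $t$ is the rate multiplied by $t$.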


From (19) and (38) we see that the mean-square of the number of crossings, $\overline{n^2}$, is given by

$$\overline{n^2} = \bar{n} + 2\lim_{\epsilon \to 0}\int_0^t dt_2\int_0^{t_2} dt_1\int_{-\infty}^{\infty}|\dot{x}_2|\,d\dot{x}_2\left[\int_0^{\infty}\dot{x}_1\,d\dot{x}_1\,P(a - \epsilon, \dot{x}_2, t_2;\ a, \dot{x}_1, t_1 \mid x_0, \dot{x}_0, 0) + \int_{-\infty}^{0}|\dot{x}_1|\,d\dot{x}_1\,P(a + \epsilon, \dot{x}_2, t_2;\ a, \dot{x}_1, t_1 \mid x_0, \dot{x}_0, 0)\right], \qquad x_0 < a, \quad \epsilon > 0, \tag{41}$$

where we have integrated over the coordinates corresponding to $x_r(t)$, $2 \le r \le n - 1$, expressing our result in terms of the joint conditional probability density function $P(x_2, \dot{x}_2, t_2;\ x_1, \dot{x}_1, t_1 \mid x_0, \dot{x}_0, 0)$ of $x, \dot{x}$ at time $t_2$ and of $x, \dot{x}$ at time $t_1$, given the initial values $x_0, \dot{x}_0$ at time $t = 0$. Eq. (41) corresponds to the result of Steinberg et al.⁴ and of Miller and Freund,⁵ although these authors use different methods of avoiding the singularity at $t_1 = t_2$ in the integrand. Moments of higher order can be obtained by continuing the series solution (38).
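The relation between the expansion of $h$ about $z = 1$ and the moments can be illustrated on any discrete distribution: the coefficient of $(z - 1)$ is the mean $\bar{n}$, and twice the coefficient of $(z - 1)^2$ is $\overline{n(n-1)}$, whence $\overline{n^2} = \bar{n} + h''(1)$, which is the structure of (41). The sketch below uses a made-up three-point distribution purely for illustration; the function name is hypothetical.

```python
import numpy as np
from numpy.polynomial import polynomial as npoly


def crossing_moments(p):
    """Mean and mean-square of n from the generating function
    h(z) = sum_n z**n p[n], using derivatives of h at z = 1."""
    p = np.asarray(p, dtype=float)
    mean = npoly.polyval(1.0, npoly.polyder(p, 1))         # h'(1)  = E[n]
    factorial2 = npoly.polyval(1.0, npoly.polyder(p, 2))   # h''(1) = E[n(n-1)]
    return float(mean), float(mean + factorial2)


# Illustrative distribution: p_0 = 0.5, p_1 = 0.3, p_2 = 0.2
p = [0.5, 0.3, 0.2]
mean, mean_square = crossing_moments(p)
n = np.arange(len(p))
print(mean, mean_square, float(np.sum(n * p)), float(np.sum(n**2 * p)))
```

The last two printed values are the same moments computed directly from the distribution and agree with the generating-function results.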

We remark in closing that these methods can be applied to determine the distributions of the number of maxima or minima of a random function in a given interval. The relevant boundary in $n$-dimensional space is now the hyperplane $\dot{x} = 0$. If the process is twice differentiable ($m < n - 2$), we can define an acceleration variable $\ddot{x}(t) = x_2(t) = d^2x/dt^2$ which is also a component of the Markov process. Maxima then correspond to the trajectory in $n$-dimensional space crossing $\dot{x} = 0$ in $\ddot{x} < 0$, while minima correspond to crossing it in $\ddot{x} > 0$. The methods of this section will yield, e.g., in place of (40), Rice's expression³ for the average number of maxima in a range $x$ to $x + dx$ in an interval of length $t$. Such quantities as the variance of the total number of maxima and minima can be found by similarly extending the analysis.

Unfortunately none of the methods of this section is simple for computation. To solve the Fokker-Planck differential equations would no doubt require a high-speed computing machine. The integral equations are made difficult to handle by the singularity of the kernels, and the Wiener-Rice series which result when they are solved by iteration contain cumbersome integrals even in terms of low order. To attain results of practical importance, approximations must be made. It is hoped that the theory outlined here will be of use in such an effort.

¹¹ D. Middleton, “Spurious signals caused by noise in triggered circuits,” J. Appl. Phys., vol. 19, pp. 817-830; September, 1948.


An Analysis of Coherent Integration and Its Application
to Signal Detection*


K. S. MILLER† AND R. I. BERNSTEIN†


Summary—An important characteristic of coherent integrators is that their effective bandwidth decreases as the integration time increases. If it is only known that a weak signal occurs somewhere in a given frequency range, then the number of integration channels required to cover the specified range increases as the amount of coherent integration is increased. However, each integration channel can independently cause a false alarm, although only the particular channel in which the signal appears can cause a true alarm. The question arises therefore whether it is profitable to lengthen the coherent integration period to increase the signal-to-noise ratio when doing so requires an increase in the number of integration channels. This problem is investigated analytically. Numerical results appropriate for system design are presented as a series of graphs of missed-signal probability vs number of integration channels, with initial signal-to-noise ratio and over-all false alarm probability as parameters.

Also included is a detailed analysis of statistical properties of ideal and approximate ideal coherent integrators.


I. INTRODUCTION


MANY radar systems of current interest employ coherent integration for the detection and identification of weak signals in Gaussian noise. In the case of coherent cw radars the echo, after coherent heterodyning, is a sine wave whose existence indicates the presence of a target and whose frequency reveals information concerning the target. In the case of coherent pulse radars the echo, after coherent heterodyning, is a pulse train with a sinusoidal envelope. We shall assume that the pulse envelope is converted to cw by subsequent processing or otherwise placed on a basis which allows it to be considered in the same manner as a continuous sine wave. A common method of converting the amplitude modulated pulse train to a continuous wave employs a pulse stretcher, or “boxcar,” circuit followed by a filter which compensates for the pulse stretcher's spectral characteristic.

Properties of coherent integrators will be defined pre- 
cisely and investigated in Section II. The application here 
to signal detection will be the problem of devising a pro- 
cedure to test the hypothesis that a sinusoidal signal exists 
in the presence of Gaussian noise. (It is desirable also to 
identify the frequency of the signal.) Towards this end 
the enhancement of the signal-to-noise ratio achieved by 
coherent integration will be analyzed in Section III. 

The quality, or strength, of a statistical hypothesis test is specified by giving the false alarm and missed-signal probabilities. A false alarm, known in statistical terminology as a Type I error, occurs if it is concluded that a signal is present when no signal really exists. A missed signal, or Type II error, occurs when a signal really exists and it is concluded that none is present. The missed-signal probability is one minus the detection probability. The hypothesis test can be implemented by applying the combination of signal and noise to a threshold device which indicates whether a preset amplitude level is exceeded. If the threshold is exceeded, the hypothesis is accepted that the signal has occurred. A knowledge of the amplitude probability density functions of the noise alone and of the signal plus noise is required in order to ascertain the false alarm and missed-signal probabilities that result from the choice of an acceptance threshold voltage. This problem of detection and its ramifications will be discussed in Section IV.
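The two error probabilities can be illustrated with a deliberately simplified model. The sketch below assumes the quantity applied to the threshold is Gaussian with standard deviation `sigma`, with mean zero for noise alone and mean `signal_level` when the signal is present. This is an assumption made only for illustration; it is not the envelope statistics analyzed later in the paper, and the function names are hypothetical.

```python
from math import erfc, sqrt


def gaussian_tail(x, mean, sigma):
    """P(V > x) for V normally distributed with the given mean and sigma."""
    return 0.5 * erfc((x - mean) / (sigma * sqrt(2.0)))


def threshold_test_errors(threshold, signal_level, sigma):
    """False-alarm (Type I) and missed-signal (Type II) probabilities of a
    simple threshold test under the Gaussian model described above."""
    false_alarm = gaussian_tail(threshold, 0.0, sigma)            # noise alone exceeds
    missed = 1.0 - gaussian_tail(threshold, signal_level, sigma)  # signal fails to exceed
    return false_alarm, missed


print(threshold_test_errors(threshold=1.5, signal_level=3.0, sigma=1.0))
```

Raising the threshold lowers the false-alarm probability and raises the missed-signal probability; with the threshold midway between the two means, as in the call above, the two are equal.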


II. COHERENT INTEGRATION


In order to improve the quality of the signal detection 
test, we desire to improve the signal-to-noise ratio. It will 
be shown that this enhancement can be achieved by the 
use of coherent integration. This section will be devoted ac- 
cordingly to a discussion of coherent integrators. 

In Part A the ideal coherent integrator (ICI) will be 
defined and some of its features discussed. In Part B we 
shall investigate the statistical properties of inputs and 
outputs, and in Part C we shall compute the cumulative 
distribution of the envelopes of the outputs of a bank of 
n ICI. A device which performs almost as an ICI will be 
described in Part D. Finally, in part E we shall analyze 
some of the statistical properties of inputs and outputs of 
this physical device. 


A. The Ideal Coherent Integrator 


An ideal coherent integrator essentially integrates the quadrature components of the envelope of an applied sinusoid. We make this notion precise. Let $e_1(t)$ be a given signal. Then with respect to the frequency $\omega_0$, $e_1(t)$ may be written as

$$e_1(t) = a_1(t)\cos\omega_0 t + b_1(t)\sin\omega_0 t. \tag{1}$$

The above representation is not unique. For example, $e_1(t)$ could also be written as $e_1(t) = c_1(t)\cos\omega_0 t + d_1(t)\sin\omega_0 t$ where $c_1(t) = a_1(t) + b_1(t)\tan\omega_0 t$ and $d_1(t) = 0$. However, there is only one expression of the form of (1) where $[a_1^2(t) + b_1^2(t)]^{1/2}$ is the envelope¹ of the waveform $e_1(t)$ with respect to the frequency $\omega_0$. It will be assumed in future work that (1) represents this unique expansion of $e_1(t)$ in terms of two amplitude-modulated sinusoids in quadrature at the frequency $\omega_0$. An ICI tuned to the frequency $\omega_0$ is defined as a device whose response $e_2(t)$ to the applied input $e_1(t)$ is

$$e_2(t) = \cos\omega_0 t\int_0^t a_1(\xi)\,d\xi + \sin\omega_0 t\int_0^t b_1(\xi)\,d\xi. \tag{2}$$

To illustrate this point Fig. 1 shows the response of an ICI to a constant amplitude sine wave whose frequency is the same as the integrator's tuned frequency. Thus, $e_1(t) = \sin\omega_0 t$ and $e_2(t) = t\sin\omega_0 t$. We shall call $\omega_0$ the “tuned frequency” of the coherent integrator. The integrator is “ideal” if the coefficients $a_2(t)$ and $b_2(t)$ of $e_2(t)$ are exactly the integrals of $a_1(t)$ and $b_1(t)$, respectively. In many practical cases these integrals are only approximated,

$^1$ In general, if $y = f(t, w)$ where $w$ is a parameter, the envelope 
may be determined by eliminating $w$ between the equations $y - f = 0$ 
and $\partial f/\partial w = 0$. 

Fig. 1. Output $e_2(t)$ of ideal coherent integrator with input $e_1(t)$, a constant amplitude sinusoid at tuned frequency of integrator. 

Fig. 2. Phasor representation of coherent integrator response. 


and then the device used as a coherent integrator is not 
ideal. 

An interesting phasor interpretation of the response of 
an ideal coherent integrator to an input signal of frequency 
unequal to the tuned frequency of the ICI may be given. 
While this result is not explicitly used in this paper, it 
has been found to be a useful tool in certain associated 
work. 

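In order to develop this phasor representation, suppose 
that 

$$e_1(t, \Delta\omega) = \cos(\omega_0 + \Delta\omega)t$$

is a constant amplitude sinusoid at a frequency displaced 
by an amount $\Delta\omega$ from the integrator's tuned frequency. 
Then the input to the ICI will be 

$$e_1(t, \Delta\omega) = \cos\Delta\omega t\,\cos\omega_0 t - \sin\Delta\omega t\,\sin\omega_0 t.$$

From (2) it follows that 

$$e_2(t, \Delta\omega) = \cos\omega_0 t \int_0^t \cos\Delta\omega\xi\,d\xi - \sin\omega_0 t \int_0^t \sin\Delta\omega\xi\,d\xi$$

which may be written as 

$$e_2(t, \Delta\omega) = \frac{\sin\dfrac{\Delta\omega t}{2}}{\dfrac{\Delta\omega}{2}}\,\cos\left(\omega_0 + \frac{\Delta\omega}{2}\right)t. \tag{3}$$

If $\Delta\omega = 0$, of course, $e_1(t, \Delta\omega) = e_1(t, 0) = \cos\omega_0 t$ and 
the output of the ICI is $e_2(t, 0) = t\cos\omega_0 t$. 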
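The closed form (3) can be checked numerically against the defining integrals in (2). The following is only a sketch: the function names, the midpoint rule, and the numerical values of the tuned frequency, offset, and time are illustrative assumptions, not part of the original analysis.

```python
import math

def ici_response(a, b, w0, t, steps=20000):
    """Eq. (2): cos(w0 t) * int_0^t a + sin(w0 t) * int_0^t b, via the midpoint rule."""
    h = t / steps
    int_a = sum(a((k + 0.5) * h) for k in range(steps)) * h
    int_b = sum(b((k + 0.5) * h) for k in range(steps)) * h
    return math.cos(w0 * t) * int_a + math.sin(w0 * t) * int_b

def closed_form(w0, dw, t):
    """Eq. (3): [sin(dw t/2) / (dw/2)] * cos((w0 + dw/2) t)."""
    return math.sin(dw * t / 2) / (dw / 2) * math.cos((w0 + dw / 2) * t)

# Hypothetical values: 50 cps tuned frequency, 0.3 cps offset, t = 1.7 s.
w0, dw, t = 2 * math.pi * 50.0, 2 * math.pi * 0.3, 1.7

# Quadrature components of cos((w0 + dw) t): a1 = cos(dw t), b1 = -sin(dw t).
numeric = ici_response(lambda x: math.cos(dw * x),
                       lambda x: -math.sin(dw * x), w0, t)
print(numeric, closed_form(w0, dw, t))
```

The two printed values should agree to within the quadrature error of the midpoint rule.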

Fig. 2 is a phasor diagram of these results portrayed 
on a set of axes which revolve at the frequency $\omega_0$. The 
tip of the vector $e_2(t, 0)$ moves at a uniform rate along the 
horizontal axis and has a length $t$. The tip of vector 
$e_2(t, \Delta\omega)$ moves at a constant angular rate on the 
circumference of a circle with center at $(0, 1/\Delta\omega)$ and radius 
$1/\Delta\omega$. Although the developed length of the trajectory 
of $e_2(t, \Delta\omega)$ is always the same as the length of $e_2(t, 0)$, 
the vector itself experiences periodic nulls. 

We wish to draw one further conclusion concerning this 
phasor representation. Let $E(t, \Delta\omega)$ be the ratio of the 
phasors representing $e_2(t, \Delta\omega)$ and $e_2(t, 0)$. That is, 

$$E(t, \Delta\omega) = \frac{\sin\dfrac{\Delta\omega t}{2}}{\dfrac{\Delta\omega t}{2}}\;\angle\;\frac{\Delta\omega t}{2}. \tag{4}$$

Among other things, this result shows that the effective 
bandwidth of an ICI decreases as the integration time $t$ 
increases. This relation is exploited in Section IV, where 
we discuss the multichannel problem. 
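The narrowing of the response with integration time can be illustrated with a short sketch that evaluates the magnitude of the phasor ratio, $|\sin(\Delta\omega t/2)/(\Delta\omega t/2)|$. The function names and the sample offset of 0.1 cps are assumptions chosen only for illustration.

```python
import math

def phasor_gain(dw, t):
    """Magnitude of E(t, dw): |sin(dw t/2) / (dw t/2)|, equal to 1 at dw = 0."""
    x = dw * t / 2
    return 1.0 if x == 0 else abs(math.sin(x) / x)

def first_null_offset_cps(t):
    """Frequency offset in cps at which the response first vanishes (dw * t = 2 pi)."""
    return 1.0 / t

dw = 2 * math.pi * 0.1   # hypothetical fixed offset of 0.1 cps
for t in (1.0, 2.0, 4.0, 8.0):
    print(t, first_null_offset_cps(t), phasor_gain(dw, t))
```

For a fixed offset the gain falls as $t$ grows, and the first null moves in toward the tuned frequency as $1/t$.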

B. Statistical Properties of Ideal Coherent Integrators 


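Suppose that the input to an ICI tuned to the frequency 
$\omega_0$ is statistically stationary Gaussian noise, $i(t)$. We shall 
assume that $i(t)$ has mean zero and correlation function 
$\psi(\tau)$. Rice$^2$ has shown that $i(t)$ may be written as 

$$i(t) = i_C(t)\cos\omega_0 t + i_S(t)\sin\omega_0 t$$

where $i_C$ and $i_S$ are independent Gaussian variates. It 
follows immediately that 

$$\langle i\rangle = \langle i_C\rangle = \langle i_S\rangle = 0$$

(where $\langle\cdots\rangle$ indicates ensemble average) and 

$$\langle i^2\rangle = \langle i_C^2\rangle = \langle i_S^2\rangle = \psi_0$$

where $\psi_0 = \psi(0)$. 

The output of the ICI is, by definition, 

$$I(t) = \left[\int_0^t i_C(\xi)\,d\xi\right]\cos\omega_0 t + \left[\int_0^t i_S(\xi)\,d\xi\right]\sin\omega_0 t.$$

The stochastic integrals $I_C(t) = \int_0^t i_C(\xi)\,d\xi$ and $I_S(t) = \int_0^t i_S(\xi)\,d\xi$ 
are also independent Gaussian variates. Their 
means are zero since 

$$\langle I_C(t)\rangle = \int_0^t \langle i_C(\xi)\rangle\,d\xi = 0$$

with a similar formula for $\langle I_S\rangle$. Of course, $\langle I\rangle$ is also zero. 
However, their variances are time dependent. To compute 
the variance of $I(t)$ write 

$$\langle I^2(t)\rangle = \langle I_C^2(t)\rangle\cos^2\omega_0 t + \langle I_S^2(t)\rangle\sin^2\omega_0 t + 2\langle I_C(t)I_S(t)\rangle\cos\omega_0 t\,\sin\omega_0 t.$$

Now 

$$\langle I_C^2(t)\rangle = \int_0^t\!\!\int_0^t \langle i_C(\xi)\,i_C(\zeta)\rangle\,d\xi\,d\zeta$$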


and Rice$^3$ has shown that 

$$\langle i_C(\xi)\,i_C(\zeta)\rangle = \langle i_S(\xi)\,i_S(\zeta)\rangle = \int_0^\infty w(f)\cos 2\pi(f - f_0)(\xi - \zeta)\,df \tag{5}$$

where 

$$w(f) = 4\int_0^\infty \psi(\tau)\cos 2\pi f\tau\,d\tau$$

is the power spectrum of the $i(t)$ process. Thus 

$$\langle I_C^2(t)\rangle = \langle I_S^2(t)\rangle = \int_0^\infty w(f)\left[\frac{\sin\pi t(f - f_0)}{\pi(f - f_0)}\right]^2 df. \tag{6}$$

$^2$ S. O. Rice, "Mathematical analysis of random noise," Bell Sys. 
Tech. J., vol. 24, pp. 46-156; January, 1945. See p. 75. 
$^3$ Ibid., p. 77. 


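Also, 

$$\langle i_S(\xi)\,i_C(\zeta)\rangle = \int_0^\infty w(f)\sin 2\pi(f - f_0)(\xi - \zeta)\,df$$

from which it follows that 

$$\langle I_C(t)\,I_S(t)\rangle = \int_0^t\!\!\int_0^t \langle i_S(\xi)\,i_C(\zeta)\rangle\,d\xi\,d\zeta = 0$$

and hence 

$$\langle I^2(t)\rangle = \langle I_C^2(t)\rangle\cos^2\omega_0 t + \langle I_S^2(t)\rangle\sin^2\omega_0 t$$

is equal to the common value of $\langle I_C^2(t)\rangle$ and $\langle I_S^2(t)\rangle$ as 
given by (6). For large values of $t$, $\langle I^2(t)\rangle$ is asymptotic 
to $t\,w(f_0)$. 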
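The asymptotic statement can be illustrated by evaluating the integral in (6) numerically for a simple spectrum. This is a sketch under stated assumptions: the spectrum is taken to be flat at a hypothetical level of 2.0 over 0 to 100 cps, the tuned frequency is 50 cps, and the integral is approximated by the midpoint rule.

```python
import math

def F(x, t):
    """F(x) = sin(pi t x) / (pi x), with the limiting value F(0) = t."""
    return t if x == 0 else math.sin(math.pi * t * x) / (math.pi * x)

def output_variance(w, f0, t, f_max, steps=200000):
    """Midpoint-rule estimate of int_0^f_max w(f) F(f - f0)^2 df, as in Eq. (6)."""
    h = f_max / steps
    return sum(w((k + 0.5) * h) * F((k + 0.5) * h - f0, t) ** 2
               for k in range(steps)) * h

level = 2.0                      # hypothetical flat spectral level
flat = lambda f: level           # assumed spectrum on [0, f_max]
t = 4.0
var = output_variance(flat, f0=50.0, t=t, f_max=100.0)
print(var, t * level)            # numerical variance vs. the asymptote t * w(f0)
```

With these values the product of integration time and tuned frequency is large, so the two printed numbers should differ only by the small contribution of the truncated tails.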

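It is also of interest to compute the correlation function 
of the outputs of two ICI of different tuned frequencies. 
Suppose, then, that $i(t)$ is a stationary Gaussian process 
of mean zero and correlation function $\psi(\tau)$. The input 
to an ICI of resonant frequency $\omega_r$ is 

$$i(t) = i_{Cr}(t)\cos\omega_r t + i_{Sr}(t)\sin\omega_r t,$$

and the output is 

$$I_r(t) = \left[\int_0^t i_{Cr}(\xi)\,d\xi\right]\cos\omega_r t + \left[\int_0^t i_{Sr}(\xi)\,d\xi\right]\sin\omega_r t.$$

Similarly, the input to an ICI of frequency $\omega_s$ may be 
written 

$$i(t) = i_{Cs}(t)\cos\omega_s t + i_{Ss}(t)\sin\omega_s t$$

with output 

$$I_s(t) = \left[\int_0^t i_{Cs}(\xi)\,d\xi\right]\cos\omega_s t + \left[\int_0^t i_{Ss}(\xi)\,d\xi\right]\sin\omega_s t.$$

Following Rice it is easy to see that 

$$\langle i_{Cr}(\xi)\,i_{Cs}(\zeta)\rangle = \langle i_{Sr}(\xi)\,i_{Ss}(\zeta)\rangle = \int_0^\infty w(f)\cos 2\pi[(f - f_r)\xi - (f - f_s)\zeta]\,df$$

and 

$$\langle i_{Cr}(\xi)\,i_{Ss}(\zeta)\rangle = -\langle i_{Sr}(\xi)\,i_{Cs}(\zeta)\rangle = \int_0^\infty w(f)\sin 2\pi[(f - f_r)\xi - (f - f_s)\zeta]\,df$$

where $w(f)$ is the (stationary) power spectrum of $i(t)$. 
We define the cross correlation of $I_r(t)$ and $I_s(t)$ (at the 
same instant) as 

$$\Psi_{rs}(t) = \langle I_r(t)\,I_s(t)\rangle.$$

Then it follows as before that 

$$\Psi_{rs}(t) = \cos\pi t(f_r - f_s)\int_0^\infty w(f)\,\frac{\sin\pi t(f - f_r)}{\pi(f - f_r)}\,\frac{\sin\pi t(f - f_s)}{\pi(f - f_s)}\,df$$

which vanishes for values of $t$ which make $\pi t(f_r - f_s)$ 
an odd multiple of $\pi/2$ with no restrictions on the form 
of the spectral density of $i(t)$. 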


C. Cumulative Distribution of the Envelope of ICI 


If we have a bank of $n$ ICI tuned to the frequencies 
$\omega_1, \cdots, \omega_n$, we wish to compute the cumulative 
distribution of the envelopes of the outputs. Thus, representing 
the output of the $r$th ICI as 

$$I_r(t) = I_{Cr}(t)\cos\omega_r t + I_{Sr}(t)\sin\omega_r t, \qquad r = 1, 2, \cdots, n \tag{7}$$

the envelope is defined as 

$$R_r(t) = \sqrt{I_{Cr}^2(t) + I_{Sr}^2(t)}. \tag{8}$$

If the joint fr.f. of the $R$'s is denoted by $p(R_1, \cdots, R_n)$, 
then the cumulative distribution $C$ is 

$$C(\eta) = \int_0^\eta \cdots \int_0^\eta p(R_1, \cdots, R_n)\,dR_1 \cdots dR_n. \tag{9}$$


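Suppose then that the input to each ICI is statistically 
stationary Gaussian noise $i(t)$ with mean zero, correlation 
function $\psi(\tau)$, and power spectrum $w(f)$. Thus 
$I_{C1}, I_{S1}, I_{C2}, I_{S2}, \cdots, I_{Cn}, I_{Sn}$ have a joint $2n$-dimensional 
Gaussian distribution. As in Part B, it is seen that 

$$\Psi_{rs}(t) = \langle I_{Cr}I_{Cs}\rangle = \langle I_{Sr}I_{Ss}\rangle$$

$$= \int_0^t\!\!\int_0^t d\xi\,d\zeta \int_0^\infty w(f)\cos 2\pi[(f - f_r)\xi - (f - f_s)\zeta]\,df$$

$$= \cos\pi t(f_s - f_r)\int_0^\infty w(f)\,F(f - f_r)\,F(f - f_s)\,df \tag{10}$$

and 

$$\Phi_{rs}(t) = \langle I_{Cr}I_{Ss}\rangle = -\langle I_{Sr}I_{Cs}\rangle$$

$$= \int_0^t\!\!\int_0^t d\xi\,d\zeta \int_0^\infty w(f)\sin 2\pi[(f - f_r)\xi - (f - f_s)\zeta]\,df$$

$$= \sin\pi t(f_s - f_r)\int_0^\infty w(f)\,F(f - f_r)\,F(f - f_s)\,df \tag{11}$$

where 

$$F(x) = \frac{\sin\pi t x}{\pi x}. \tag{12}$$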


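The correlation matrix of the $2n$-dimensional Gaussian 
distribution is thus 

$$M = \begin{bmatrix}
\Psi_{11} & 0 & \Psi_{12} & \Phi_{12} & \cdots & \Psi_{1n} & \Phi_{1n} \\
0 & \Psi_{11} & -\Phi_{12} & \Psi_{12} & \cdots & -\Phi_{1n} & \Psi_{1n} \\
\Psi_{21} & \Phi_{21} & \Psi_{22} & 0 & \cdots & \Psi_{2n} & \Phi_{2n} \\
-\Phi_{21} & \Psi_{21} & 0 & \Psi_{22} & \cdots & -\Phi_{2n} & \Psi_{2n} \\
\vdots & & & & & & \vdots \\
\Psi_{n1} & \Phi_{n1} & \Psi_{n2} & \Phi_{n2} & \cdots & \Psi_{nn} & 0 \\
-\Phi_{n1} & \Psi_{n1} & -\Phi_{n2} & \Psi_{n2} & \cdots & 0 & \Psi_{nn}
\end{bmatrix} \tag{13}$$

with respect to the variables $I_{C1}, I_{S1}, \cdots, I_{Cn}, I_{Sn}$ in 
that order. Note that $\Psi_{rs} = \Psi_{sr}$ and $\Phi_{rs} = -\Phi_{sr}$. Calling 
$q(I_{C1}, \cdots, I_{Sn})$ the joint $2n$-dimensional distribution, and 
letting 

$$I_{Cr} = R_r\cos\theta_r, \qquad I_{Sr} = R_r\sin\theta_r, \qquad r = 1, 2, \cdots, n$$

the joint fr.f. of the $R$'s and $\theta$'s is 

$$f(R_1, \theta_1, \cdots, R_n, \theta_n) = q(I_{C1}, I_{S1}, \cdots, I_{Cn}, I_{Sn})\,J \tag{14}$$

where $J$ is the Jacobian of the transformation from the 
$I_C$, $I_S$ variables to the $R$ and $\theta$ variables. The fr.f. of the 
$R$'s is thus given by the marginal distribution 

$$p(R_1, \cdots, R_n) = \int_0^{2\pi} \cdots \int_0^{2\pi} f(R_1, \theta_1, \cdots, R_n, \theta_n)\,d\theta_1 \cdots d\theta_n$$

which allows us to calculate the cumulative distribution 
$C$ of (9). 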

In order to compute $p(R_1, \cdots, R_n)$, we resort to the 
following technique. The fr.f. $p(R_1, \cdots, R_n)$ depends on 
the $\Psi_{rs}$ and the $\Phi_{rs}$ as can be seen from (13). Therefore, 
consider $p(R_1, \cdots, R_n)$ as depending on the parameters 
$G_{rs}$ where 

$$\Phi_{rs}(t) = \sin\pi t(f_s - f_r)\,G_{rs}, \qquad \Psi_{rs}(t) = \cos\pi t(f_s - f_r)\,G_{rs}; \tag{15}$$

that is, 

$$G_{rs} = \int_0^\infty w(f)\,F(f - f_r)\,F(f - f_s)\,df. \tag{16}$$

Because $G_{rs} = G_{sr}$ only the cases $s > r$ need be considered. 
Thus, we may write 

$$p(R_1, \cdots, R_n) = g(R, G)$$

where $R$ is the $n$-tuple $R_1, \cdots, R_n$ and $G$ is the $\tfrac{1}{2}n(n - 1)$ 
dimensional parameter $G_{rs}$, $s > r$; $s, r = 1, \cdots, n$. Now 
expand $g(R, G)$ in a multiple power series, 

$$g(R, G) = g(R, 0) + \sum_{s>r} \frac{\partial g(R, 0)}{\partial G_{rs}}\,G_{rs} + \frac{1}{2!}\sum_{s>r}\sum_{j>i} \frac{\partial^2 g(R, 0)}{\partial G_{rs}\,\partial G_{ij}}\,G_{rs}G_{ij} + \cdots. \tag{17}$$

This is a valid representation for $g(R, G)$ within the 
radius of convergence of the power series. 

The value of the above representation is the following. 
If $G = 0$, then the correlation matrix $M$ of (13) is purely 
diagonal. Hence $g(R, 0)$ is simply the product of the 
one-dimensional marginal distributions: 

$$g(R, 0) = p_1(R_1)\,p_2(R_2) \cdots p_n(R_n).$$

The marginal distributions $p_i(R_i)$ are Rayleigh: 

$$p_i(R_i) = \frac{R_i}{\Psi_{ii}}\,e^{-R_i^2/2\Psi_{ii}}. \tag{18}$$

Now to compute $\partial g(R, 0)/\partial G_{rs}$, first set every $G_{ij}$ 
except $G_{rs}$ equal to zero, differentiate with respect to 
$G_{rs}$, and then set it equal to zero. If we let $g(R, G_{rs})$ 
represent $g(R, G)$ with every $G_{ij}$ except $G_{rs}$ set equal to 
zero then 

$$g(R, G_{rs}) = p(R_r, R_s)\,\Pi_{rs} \tag{19}$$

where $\Pi_{rs}$ is the product of the a priori distributions 
$p_1(R_1), \cdots, p_n(R_n)$ except $p_r(R_r)$ and $p_s(R_s)$. Thus, the 
problem is reduced to that of calculating $p(R_r, R_s)$, and 

$$\frac{\partial}{\partial G_{rs}}\,g(R, 0) = \left(\frac{\partial}{\partial G_{rs}}\,p(R_r, R_s)\right)_{G_{rs}=0} \Pi_{rs}. \tag{20}$$

The computation of $\partial^2 g(R, 0)/\partial G_{rs}\,\partial G_{ij}$ is somewhat 
more difficult. If the pair $(r, s)$ is the same as $(i, j)$, the 
formulation of (19) may be used to compute $\partial^2 g(R, 0)/\partial G_{rs}^2$. 
If $(r, s)$ is distinct from $(i, j)$, then 

$$\frac{\partial^2 g(R, 0)}{\partial G_{rs}\,\partial G_{ij}} = \left(\frac{\partial}{\partial G_{rs}}\,p(R_r, R_s)\right)_{G_{rs}=0} \left(\frac{\partial}{\partial G_{ij}}\,p(R_i, R_j)\right)_{G_{ij}=0} \Pi_{rsij} \tag{21}$$

where $\Pi_{rsij}$ is defined in the expected fashion. Finally, 
we must consider the cases where $r = i$ and $s \ne j$ or $r \ne i$ 
and $s = j$. In these cases we must compute $p(R_r, R_i, R_j)$ 
(assuming $s = j$). However, referring to the correlation 
matrix $M$ of (13) it can be shown by a lengthy but 
elementary calculation that 

$$\left(\frac{\partial^2}{\partial G_{rj}\,\partial G_{ij}}\,p(R_r, R_i, R_j)\right)_{G=0} = 0 \tag{22}$$

[cf. (15)]. 
Now turn to the computation of $p(R_r, R_s)$. For this 
case the correlation matrix becomes$^4$ 

$$M = \begin{bmatrix}
G_{rr} & 0 & G_{rs}\cos x & G_{rs}\sin x \\
0 & G_{rr} & -G_{rs}\sin x & G_{rs}\cos x \\
G_{rs}\cos x & -G_{rs}\sin x & G_{ss} & 0 \\
G_{rs}\sin x & G_{rs}\cos x & 0 & G_{ss}
\end{bmatrix} \tag{23}$$

$^4$ For the case $G_{rr} = G_{ss}$ this is similar to a result of D. Middleton, 
"Some general results in the theory of noise through non-linear 
devices," Quart. Appl. Math., vol. 5, pp. 445-498; January, 1948. In 
particular, see pp. 469-472. 


Miller and Bernstein: Analysis of Coherent Integration 


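where 

$$x = \pi t(f_s - f_r). \tag{24}$$

If $\Delta_{rs}$ is defined as 

$$\Delta_{rs} = G_{rr}G_{ss} - G_{rs}^2 \tag{25}$$

then an elementary calculation shows that the determinant 
of $M$ is $\det M = \Delta_{rs}^2$, and the cofactors of this symmetric 
matrix are 

$$M_{11} = M_{22} = \Delta_{rs}G_{ss}, \qquad M_{33} = M_{44} = \Delta_{rs}G_{rr},$$

$$M_{12} = M_{34} = 0, \qquad M_{23} = -M_{14} = \Delta_{rs}G_{rs}\sin x,$$

$$M_{13} = M_{24} = -\Delta_{rs}G_{rs}\cos x.$$

The joint distribution $q(I_{Cr}, I_{Sr}, I_{Cs}, I_{Ss})$ is thus 

$$q(I_{Cr}, I_{Sr}, I_{Cs}, I_{Ss}) = \frac{1}{4\pi^2\Delta_{rs}} \exp\Big\{-\frac{1}{2\Delta_{rs}}\big[G_{ss}(I_{Cr}^2 + I_{Sr}^2) + G_{rr}(I_{Cs}^2 + I_{Ss}^2)$$

$$\qquad - 2G_{rs}\sin x\,(I_{Cr}I_{Ss} - I_{Sr}I_{Cs}) - 2G_{rs}\cos x\,(I_{Cr}I_{Cs} + I_{Sr}I_{Ss})\big]\Big\}.$$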


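The Jacobian appearing in (14) is $J = R_rR_s$ and hence 
the marginal distribution $p(R_r, R_s)$ is 

$$p(R_r, R_s) = \frac{R_rR_s}{4\pi^2\Delta_{rs}} \exp\left[-\frac{1}{2\Delta_{rs}}(G_{ss}R_r^2 + G_{rr}R_s^2)\right]$$

$$\qquad \times \int_0^{2\pi}\!\!\int_0^{2\pi} \exp\left\{\frac{G_{rs}R_rR_s}{\Delta_{rs}}\big[\sin x\,\sin(\theta_s - \theta_r) + \cos x\,\cos(\theta_s - \theta_r)\big]\right\} d\theta_r\,d\theta_s$$

$$= \frac{R_rR_s}{\Delta_{rs}} \exp\left[-\frac{1}{2\Delta_{rs}}(G_{ss}R_r^2 + G_{rr}R_s^2)\right] I_0\!\left(\frac{G_{rs}R_rR_s}{\Delta_{rs}}\right) \tag{26}$$

where $I_0$ is the Bessel function of the first kind and order 
zero with purely imaginary argument. 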
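One now sees immediately that 

$$\left(\frac{\partial}{\partial G_{rs}}\,p(R_r, R_s)\right)_{G_{rs}=0} = 0 \tag{27}$$

and 

$$\left(\frac{\partial^2}{\partial G_{rs}^2}\,p(R_r, R_s)\right)_{G_{rs}=0} = \frac{2\,p_r(R_r)\,p_s(R_s)}{\Psi_{rr}\Psi_{ss}} \left(1 - \frac{R_r^2}{2\Psi_{rr}}\right)\left(1 - \frac{R_s^2}{2\Psi_{ss}}\right) \tag{28}$$

where we recall that $G_{rr} = \Psi_{rr}$. Eq. (17) then becomes 

$$p(R_1, \cdots, R_n) = \prod_{i=1}^n p_i(R_i) + \frac{1}{2!}\sum_{\substack{r,s=1 \\ s>r}}^n \left(\frac{\partial^2 p(R_r, R_s)}{\partial G_{rs}^2}\right)_{G_{rs}=0} \Pi_{rs}\,G_{rs}^2 + \cdots$$

$$= \prod_{i=1}^n p_i(R_i) + \prod_{i=1}^n p_i(R_i) \sum_{\substack{r,s=1 \\ s>r}}^n \frac{G_{rs}^2}{\Psi_{rr}\Psi_{ss}} \left(1 - \frac{R_r^2}{2\Psi_{rr}}\right)\left(1 - \frac{R_s^2}{2\Psi_{ss}}\right) + \cdots. \tag{29}$$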
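The cumulative distribution $C(\eta)$ of (9) is thus 

$$C(\eta) = \prod_{i=1}^n \left(1 - e^{-\eta^2/2\Psi_{ii}}\right) + \sum_{\substack{r,s=1 \\ s>r}}^n \frac{G_{rs}^2}{\Psi_{rr}\Psi_{ss}}\,B_rB_s \prod_{i \ne r,s} \left(1 - e^{-\eta^2/2\Psi_{ii}}\right) + \cdots \tag{30}$$

where 

$$B_r = \int_0^\eta \left(1 - \frac{R_r^2}{2\Psi_{rr}}\right) p_r(R_r)\,dR_r = \frac{\eta^2}{2\Psi_{rr}}\,e^{-\eta^2/2\Psi_{rr}}. \tag{31}$$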
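Integrating (29) term by term over $0 \le R_i \le \eta$ requires the integral of $(1 - R^2/2\Psi)$ against the Rayleigh density (18). With the substitution $u = R^2/2\Psi$ this integral evaluates to $(\eta^2/2\Psi)\,e^{-\eta^2/2\Psi}$. The sketch below checks that evaluation numerically and computes the leading ($G = 0$) term of $C(\eta)$; the function names and sample values are assumptions made for illustration.

```python
import math

def rayleigh_pdf(r, psi):
    """Eq. (18): p(R) = (R / psi) * exp(-R^2 / (2 psi))."""
    return r / psi * math.exp(-r * r / (2 * psi))

def b_numeric(eta, psi, steps=20000):
    """Midpoint-rule value of int_0^eta (1 - R^2 / (2 psi)) p(R) dR."""
    h = eta / steps
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) * h
        total += (1 - r * r / (2 * psi)) * rayleigh_pdf(r, psi)
    return total * h

def b_closed(eta, psi):
    """Closed form obtained with the substitution u = R^2 / (2 psi)."""
    u = eta * eta / (2 * psi)
    return u * math.exp(-u)

def c_leading(eta, psis):
    """Leading term of C(eta): product of one-dimensional Rayleigh distribution functions."""
    return math.prod(1 - math.exp(-eta * eta / (2 * p)) for p in psis)

print(b_numeric(2.0, 1.5), b_closed(2.0, 1.5))
print(c_leading(3.0, [1.0, 1.0, 1.0]))
```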


Fig. 3. Lossless tank circuit used as an approximate ideal coherent 
integrator. 


D. An Approximate Ideal Coherent Integrator 


An example of a physical device which performs almost 
as an ICI is furnished by a lossless tank circuit as 
illustrated in Fig. 3. The ratio of the output voltage to the 
input current is 

$$Z(s) = \frac{sL}{1 + s^2LC}$$

where $s = j\omega$. We call the inverse Laplace transform of $Z(s)$ 
the impulsive response of the network. It is 

$$h(t) = \frac{1}{C}\cos\omega_0 t, \quad t > 0; \qquad h(t) = 0, \quad t < 0,$$

where $\omega_0^2 = 1/LC$. If we let$^5$ $C = 1/2$, then the impulsive 
response becomes 

$$h(t) = 2\cos\omega_0 t, \quad t > 0; \qquad h(t) = 0, \quad t < 0.$$

$^5$ The output of the network for an input $i(t) = \sin\omega_0 t$ is $e(t) = 
(t/2C)\sin\omega_0 t$. Since we want to approximate an ICI as closely as 
possible, we choose $C$ so that the coefficient of $t$ is one, that is, $1/2C = 1$ 
or $C = \tfrac{1}{2}$. 

By the superposition theorem, the voltage $e(t)$ across 
the tank circuit due to a current $i(t)$ applied at $t = 0$ is 

$$e(t) = \int_0^t h(\xi)\,i(t - \xi)\,d\xi.$$

For example, if $i(t) = \sin\omega_0 t$, then $e(t) = t\sin\omega_0 t$. If 
$i(t) = t\sin\omega_0 t$, then 

$$e(t) = \tfrac{1}{2}t^2\sin\omega_0 t + \frac{\sin\omega_0 t - \omega_0 t\cos\omega_0 t}{2\omega_0^2} \approx \tfrac{1}{2}t^2\sin\omega_0 t$$

for large $\omega_0$ and/or $t$. If $i(t) = \cos(\omega_0 + \Delta\omega)t$, then 

$$e(t) = \frac{2(\omega_0 + \Delta\omega)\sin(\omega_0 + \Delta\omega)t - 2\omega_0\sin\omega_0 t}{\Delta\omega(2\omega_0 + \Delta\omega)}.$$

When the deviation $\Delta\omega$ of the applied frequency from the 
resonant frequency $\omega_0$ is small compared with the resonant 
frequency, that is, $|\Delta\omega| \ll \omega_0$, then to a first-order 
approximation we may write 

$$e(t) = \frac{2\omega_0\sin(\omega_0 + \Delta\omega)t - 2\omega_0\sin\omega_0 t}{2\omega_0\,\Delta\omega} = \frac{\sin\dfrac{\Delta\omega t}{2}}{\dfrac{\Delta\omega}{2}}\,\cos\left(\omega_0 t + \frac{\Delta\omega t}{2}\right)$$

which is identical with (3). 
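How closely the tank circuit tracks the ideal integrator can be seen by evaluating the superposition integral directly. The sketch below assumes the impulse response $h(t) = 2\cos\omega_0 t$, uses a midpoint rule, and picks hypothetical values (100 cps resonance, 0.5 cps offset); it compares the numerical output with the exact expression for the tank circuit and with the ideal response (3).

```python
import math

def tank_output(i, w0, t, steps=40000):
    """e(t) = int_0^t h(xi) i(t - xi) d xi with h(xi) = 2 cos(w0 xi), midpoint rule."""
    h = t / steps
    return sum(2 * math.cos(w0 * (k + 0.5) * h) * i(t - (k + 0.5) * h)
               for k in range(steps)) * h

def tank_exact(w0, dw, t):
    """Exact tank-circuit response to cos((w0 + dw) t)."""
    w1 = w0 + dw
    return (2 * w1 * math.sin(w1 * t) - 2 * w0 * math.sin(w0 * t)) / (dw * (2 * w0 + dw))

def ici_ideal(w0, dw, t):
    """Ideal coherent integrator response, Eq. (3)."""
    return math.sin(dw * t / 2) / (dw / 2) * math.cos((w0 + dw / 2) * t)

# Hypothetical values for illustration only.
w0, dw, t = 2 * math.pi * 100.0, 2 * math.pi * 0.5, 0.8
num = tank_output(lambda x: math.cos((w0 + dw) * x), w0, t)
print(num, tank_exact(w0, dw, t), ici_ideal(w0, dw, t))
```

The exact and ideal responses differ by $[\sin(\omega_0 + \Delta\omega)t + \sin\omega_0 t]/(2\omega_0 + \Delta\omega)$, which is bounded by $1/\omega_0$ in magnitude, so the approximation improves as the resonant frequency increases.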


E. Statistical Properties of an Approximate ICI 


Suppose that stationary Gaussian noise $i(t)$ with mean 
zero and correlation function $\psi(\tau)$ is applied to an 
approximate ICI with impulse response 

$$h(t) = 2\cos\omega_0 t, \quad t > 0; \qquad h(t) = 0, \quad t < 0.$$

Then the output is, by the superposition theorem, 

$$I(t) = \int_0^t h(\xi)\,i(t - \xi)\,d\xi.$$

It is readily seen that $\langle I(t)\rangle = 0$. The variance of $I(t)$ 
may be computed from the formula 

$$\langle I^2(t)\rangle = \int_0^t\!\!\int_0^t h(t - \xi)\,h(t - \zeta)\,\langle i(\xi)\,i(\zeta)\rangle\,d\xi\,d\zeta.$$

Following the notation of Lampard,$^6$ 

$$\langle I^2(t)\rangle = 4\int_0^t \phi(t - x; x)\,\psi(x)\,dx \tag{32}$$

where $\phi(t; x) = \tfrac{1}{2}\int_0^t h(\xi)\,h(\xi + x)\,d\xi$ is the time-dependent 
filter correlation function. We readily compute 

$$\phi(t - x; x) = (t - x)\cos\omega_0 x + \frac{1}{\omega_0}\sin\omega_0(t - x)\cos\omega_0 t. \tag{33}$$

$^6$ D. G. Lampard, "The response of linear networks to suddenly 
applied stationary random noise," IRE Trans., vol. CT-2, pp. 49-57; 
March, 1955. 

If $w(f)$ is the power spectrum of the $i(t)$ process, then 

$$\psi(x) = \int_0^\infty w(f)\cos 2\pi fx\,df$$

and 

$$\langle I^2(t)\rangle = \int_0^\infty w(f)\left[F^2(f + f_0) + F^2(f - f_0)\right]df + 2\cos 2\pi f_0 t \int_0^\infty w(f)\,F(f + f_0)\,F(f - f_0)\,df \tag{34}$$

where $F(x) = (\sin\pi tx)/\pi x$ [cf. (12)]. 
If $w(f)$ is white, $w(f) = w_0$ and 

$$\langle I^2(t)\rangle = w_0\left[t + \frac{\sin 2\omega_0 t}{2\omega_0}\right]. \tag{35}$$

This formula could also have been more readily obtained 
by noting that for $w(f) = w_0$, $\psi(x) = \tfrac{1}{2}w_0\,\delta(x)$ and hence 
(32) yields 

$$\langle I^2(t)\rangle = w_0\,\phi(t; 0). \tag{36}$$

Applying (33) then yields (35). 
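The filter correlation function in (33) and its use in (35) and (36) can be checked numerically. This is a sketch only: `omega0` and `level` stand for $\omega_0$ and $w_0$, and the numerical values and the midpoint rule are assumptions made for illustration.

```python
import math

def phi_numeric(t, x, omega0, steps=20000):
    """phi(t; x) = (1/2) int_0^t h(xi) h(xi + x) d xi with h = 2 cos(omega0 .), midpoint rule."""
    h = t / steps
    total = 0.0
    for k in range(steps):
        xi = (k + 0.5) * h
        total += 2 * math.cos(omega0 * xi) * 2 * math.cos(omega0 * (xi + x))
    return 0.5 * total * h

def phi_closed(t, x, omega0):
    """Eq. (33): the value of phi(t - x; x)."""
    return (t - x) * math.cos(omega0 * x) + math.sin(omega0 * (t - x)) * math.cos(omega0 * t) / omega0

def white_noise_variance(t, omega0, level):
    """Eq. (35): level * [t + sin(2 omega0 t) / (2 omega0)]."""
    return level * (t + math.sin(2 * omega0 * t) / (2 * omega0))

t, x, omega0 = 1.3, 0.4, 30.0     # hypothetical values
print(phi_numeric(t - x, x, omega0), phi_closed(t, x, omega0))
print(white_noise_variance(t, omega0, 1.0), phi_closed(t, 0.0, omega0))
```

The second line of output compares (35) with $w_0\,\phi(t; 0)$ from (36) for a unit spectral level.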

If $i(t)$ is now introduced into two distinct approximate 
ICI, say of resonant frequencies $\omega_r$ and $\omega_s$, then their 
outputs will be 

$$I_r(t) = \int_0^t h_r(\xi)\,i(t - \xi)\,d\xi$$

and 

$$I_s(t) = \int_0^t h_s(\xi)\,i(t - \xi)\,d\xi$$

respectively, where 

$$h_r(t) = 2\cos\omega_r t, \qquad h_s(t) = 2\cos\omega_s t. \tag{37}$$

Again following Lampard, define $\Psi_{rs}(t; \tau) = \langle I_r(t)\,I_s(t + \tau)\rangle$ 
as the time-dependent correlation function. Since we are 
interested only in the outputs at the same time $t$, we write 
$\Psi_{rs}(t) = \langle I_r(t)\,I_s(t)\rangle$. One then finds by a direct 
calculation that 

$$\Psi_{rs}(t) = \cos\pi t(f_r + f_s)\int_0^\infty w(f)\left[F(f + f_r)F(f - f_s) + F(f - f_r)F(f + f_s)\right]df$$

$$\qquad + \cos\pi t(f_r - f_s)\int_0^\infty w(f)\left[F(f + f_r)F(f + f_s) + F(f - f_r)F(f - f_s)\right]df$$

where $F(x)$ has been defined in (12). Note that if $\omega_r$ and 
$\omega_s$ are commensurate frequencies, $\Psi_{rs}(t)$ vanishes for 
suitable values of $t$, again with no restrictions on the 
form of the power spectrum $w(f)$. In particular, if $w(f) = w_0$, 
then 

$$\Psi_{rs}(t) = w_0\left[\frac{\sin(\omega_r + \omega_s)t}{\omega_r + \omega_s} + \frac{\sin(\omega_r - \omega_s)t}{\omega_r - \omega_s}\right] \tag{38}$$

which could also have been easily obtained by setting 
$\psi(x) = \tfrac{1}{2}w_0\,\delta(x)$. 


III. IMPROVEMENT OF THE SIGNAL-TO-NOISE RATIO 


The signal to be detected is assumed to be a constant 
amplitude, constant frequency sinusoid $x(t)$ which is 
additively combined with stationary Gaussian noise, 
$i(t)$. We shall assume that $i(t)$ has mean zero, correlation 
function $\psi(\tau)$ and power spectrum $w(f)$. 

The input to an ICI of resonant frequency $\omega_0$ is $x(t) + i(t)$. 
As in Section II-A, the constant amplitude signal 
$x(t)$ may be written as 

$$x(t) = a\cos\omega_0 t + b\sin\omega_0 t \tag{39}$$

where $a$ and $b$ describe the envelope, and following Rice 
(cf. Section II-B) $i(t)$ may be written as 

$$i(t) = i_C(t)\cos\omega_0 t + i_S(t)\sin\omega_0 t \tag{40}$$

where 

$$\langle i\rangle = \langle i_C\rangle = \langle i_S\rangle = 0$$

$$\langle i^2\rangle = \langle i_C^2\rangle = \langle i_S^2\rangle = \psi_0$$

and $i_C$ and $i_S$ are independent Gaussian processes. The 
input to the ICI of tuned frequency $\omega_0$ is thus 

$$e_1(t) = x(t) + i(t) = [a + i_C(t)]\cos\omega_0 t + [b + i_S(t)]\sin\omega_0 t \tag{41}$$

and by definition the output is 

$$e_2(t) = \cos\omega_0 t\int_0^t [a + i_C(\xi)]\,d\xi + \sin\omega_0 t\int_0^t [b + i_S(\xi)]\,d\xi. \tag{42}$$

Define the input signal power $P_{S1}$ as 

$$P_{S1} = \tfrac{1}{2}(a^2 + b^2) \tag{43}$$

and the output signal power $P_{S2}$ as 

$$P_{S2} = \frac{1}{2}\left[\left(\int_0^t a\,d\xi\right)^2 + \left(\int_0^t b\,d\xi\right)^2\right] = \tfrac{1}{2}t^2(a^2 + b^2). \tag{44}$$

The input noise power $P_{N1}$ is defined as 

$$P_{N1} = \tfrac{1}{2}\left[\langle i_C^2\rangle + \langle i_S^2\rangle\right] = \psi_0 \tag{45}$$

and the output noise power $P_{N2}$ is defined as 

$$P_{N2} = \frac{1}{2}\left\langle\left[\int_0^t i_C(\xi)\,d\xi\right]^2\right\rangle + \frac{1}{2}\left\langle\left[\int_0^t i_S(\xi)\,d\xi\right]^2\right\rangle = \int_0^\infty w(f)\,F^2(f - f_0)\,df \tag{46}$$

by (6) where $F(x) = (\sin\pi tx)/\pi x$ [cf. (12)]. 
The input signal-to-noise ratio $\rho_1$ is therefore 

$$\rho_1 = \frac{P_{S1}}{P_{N1}} = \frac{a^2 + b^2}{2\psi_0} \tag{47}$$

and the output signal-to-noise ratio $\rho_2$ is 

$$\rho_2 = \frac{P_{S2}}{P_{N2}} = \frac{t^2(a^2 + b^2)}{2\displaystyle\int_0^\infty w(f)\,F^2(f - f_0)\,df}. \tag{48}$$

Clearly, if $\rho_2 > \rho_1$ our process has resulted in a signal 
enhancement. Thus a measure of improvement is given 
by the ratio of the signal-to-noise ratios; that is, by 

$$\lambda = \frac{\rho_2}{\rho_1} = \frac{\psi_0 t^2}{\displaystyle\int_0^\infty w(f)\,F^2(f - f_0)\,df}. \tag{49}$$

If we make the change of variable $\pi t(f - f_0) = x$, then 
$\lambda$ becomes 

$$\lambda = \frac{\pi\psi_0 t}{\displaystyle\int_{-\pi tf_0}^\infty w\!\left(f_0 + \frac{x}{\pi t}\right)\left(\frac{\sin x}{x}\right)^2 dx} \tag{50}$$

which, for large values of $tf_0$, is asymptotic to 

$$\lambda \sim \frac{\psi_0 t}{w(f_0)}. \tag{51}$$

Thus $\lambda$ increases linearly with time, and the longer the 
integration period $t$, the greater the signal-to-noise ratio 
improvement. 

As a practical example, consider the case where the 
noise has a band-limited power spectrum. That is, 

$$w(f) = w_0, \quad 0 < f < B; \qquad w(f) = 0, \quad B < f.$$

Then if the integration time is $T$, the signal-to-noise 
ratio improvement factor $\lambda$ is $kBT$ where 

$$\frac{1}{k} = \frac{1}{\pi}\int_{-\pi Tf_0}^{\pi T(B - f_0)} \left(\frac{\sin x}{x}\right)^2 dx.$$

For almost all values of the coherent integrators' tuned 
frequencies $f_0$ except those near the edges of the frequency 
band $B$, the value of $k$ is essentially one. 
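The behavior of $k$ across the band can be explored numerically. The sketch below assumes the relation $1/k = (1/\pi)\int (\sin x/x)^2\,dx$ taken between $-\pi Tf_0$ and $\pi T(B - f_0)$, approximates the integral by the midpoint rule, and uses a hypothetical band of 1000 cps with a 0.05 second integration time.

```python
import math

def sinc2_integral(lo, hi, steps=100000):
    """Midpoint-rule value of int_lo^hi (sin x / x)^2 dx."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h
        total += 1.0 if x == 0 else (math.sin(x) / x) ** 2
    return total * h

def k_factor(f0, band, T):
    """k such that lambda = k * B * T, from 1/k = (1/pi) * int (sin x / x)^2 dx."""
    lo = -math.pi * T * f0
    hi = math.pi * T * (band - f0)
    return math.pi / sinc2_integral(lo, hi)

B, T = 1000.0, 0.05   # hypothetical band (cps) and integration time (seconds)
for f0 in (0.0, 10.0, 500.0, 990.0, 1000.0):
    print(f0, k_factor(f0, B, T))
```

Away from the band edges the integral is close to $\pi$, so $k$ is close to one; at an edge only half of the main lobe falls inside the noise band and $k$ approaches two.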


IV. THE MULTICHANNEL PROBLEM 


The signal-to-noise improvement factor $\lambda$ is directly 
proportional to the duration of the integration period 
[cf. (51)] while the effective bandwidth of an ICI decreases 
as the integration period increases [cf. (4)]. In certain 
important radar applications, it is known a priori that 
if the signal occurs it must fall within a known frequency 
range which shall be designated the "signal band," $B$. 
We treat the case where the signal band and noise band 
are identical. For example, in the case where the only 
cause of frequency difference between the transmitted 
and received signals is the Doppler effect produced by 
the target's velocity, the signal band can be specified 
from a knowledge of the range of possible target velocities. 
However, since the exact echo frequency is not known a 
priori, we must be prepared to process an echo which 
occurs anywhere within the signal band. 

The number of integration channels, $n$, required for 
the entire signal band increases as the integration period 
is lengthened. Each integration channel can independently 
cause a false alarm, although only the particular channel 
in which the signal appears can cause a true alarm. 
Therefore, the question arises as to whether it is desirable 
to lengthen the coherent integration period $T$ to improve 
the signal-to-noise ratio, when so doing requires an 
increase in the number of integration channels $n$ and a 
consequent increase in the opportunity for a false alarm. 
Of course, the problem is to maximize the detection 
probability $1 - \beta$, or equivalently, to minimize the 
missed-signal probability $\beta$. Suppose that the signal 
occurs in the $r$th channel. Then the output of this channel 
is of the form$^7$ 

$$[I_C(t) + at]\cos\omega_r t + [I_S(t) + bt]\sin\omega_r t. \tag{52}$$

Its envelope $R$ is then the square root of the sum of the 
squares of two independently normally distributed random 
variables with common variance $\Psi_{rr}(t)$ and means $at$ and 
$bt$ respectively. The fr.f. of $R$ is then 

$$p(R) = \frac{R}{\Psi_{rr}}\,e^{-[R^2 + t^2(a^2 + b^2)]/2\Psi_{rr}}\,I_0\!\left(\frac{Rt\sqrt{a^2 + b^2}}{\Psi_{rr}}\right) \tag{53}$$

and the detection probability is 

$$1 - \beta = \int_\eta^\infty p(R)\,dR \tag{54}$$

where $\eta$ is the acceptance threshold. 

The false alarm probability $\alpha$ is related to the threshold 
$\eta$ by 

$$\alpha = 1 - \int_0^\eta \cdots \int_0^\eta p(R_1, \cdots, R_n)\,dR_1 \cdots dR_n \tag{55}$$

where $p(R_1, \cdots, R_n)$ is the joint $n$-dimensional 
distribution of the envelopes of the outputs of the coherent 
integrators in the absence of a signal. Theoretically, 
if we choose $\alpha$ and $n$, then $\eta$ is determined by (55) as 
a function of the integration time $T$, and thus $\beta$ is 
determined by (54). The problem, formulated 
mathematically, is to determine the integration time $T$, 
threshold $\eta$, and number of channels $n$ such that $\beta$ is 
minimized. 

These quantities cannot be specified independently. 
The value of $n$ is determined by how far down on the 
skirt of the integrator's response characteristic, given by 
(4), the signal frequency will be permitted to fall before 
coming into the coverage of the adjacent integration 
channel. For the present analysis, we shall assume that the 
integration channels overlap at a frequency such that 
$\Delta\omega T = \pi/2$. This causes the amplitude of $E(T, \Delta\omega)$ to 
be equal to minus 0.912 db at the crossover frequency. 
The largest channel separation that reasonably might 
be chosen is one that causes $\Delta\omega T = 2\pi$, since adjacent 
channels would then overlap where $E(T, \Delta\omega) = 0$. However, 
this choice is not very satisfactory because a substantial 
portion of the coverage of an integration channel 
will include frequencies for which $E(T, \Delta\omega)$ is far from 
unity. The choice of $\Delta\omega T = \pi/2$ is probably the most 
satisfactory since the integrator's response is almost 
uniform in the frequency band it is assigned to cover. 
For this particular choice, it is found the number of 
integration channels required is 

$$n = 2BT \tag{56}$$

where $B$ is the signal band in cps and $T$ the integration 
period in seconds. Thus $T = 1/(4\Delta f)$, where $2\Delta f$ is the 
separation in cps between adjacent channels. 

$^7$ Eq. (52) is written on the assumption that the input signal is 
of the form $a\cos\omega_r t + b\sin\omega_r t$; that is, the frequency of the signal 
is exactly the tuned frequency of the $r$th ICI. If the signal frequency 
is actually $\omega_r + \delta\omega$, then the terms $t^2(a^2 + b^2)$ in (53) must be 
replaced by 

$$(a^2 + b^2)\,\frac{\sin^2\dfrac{\delta\omega\,t}{2}}{\left(\dfrac{\delta\omega}{2}\right)^2}.$$

However, for small $\delta\omega$, $(a^2 + b^2)t^2$ is a good approximation to the 
above expression. 
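The channel count and the crossover loss for this choice of overlap can be computed directly. This is a minimal sketch: the function name, the rounding of $2BT$ up to a whole channel, and the sample band and integration time are assumptions made for illustration.

```python
import math

def channel_plan(band_cps, T):
    """Channel count, spacing, and crossover loss when channels cross over at dw * T = pi/2."""
    spacing = 1.0 / (2.0 * T)             # separation between adjacent tuned frequencies, cps
    n = math.ceil(band_cps * 2.0 * T)     # n = 2BT, rounded up to a whole channel
    dw = 2 * math.pi * (spacing / 2.0)    # angular offset at the crossover frequency
    x = dw * T / 2.0
    crossover_db = 20 * math.log10(math.sin(x) / x)
    return n, spacing, crossover_db

# Hypothetical example: 100 cps signal band, 0.5 second integration.
print(channel_plan(100.0, 0.5))
```

Doubling the integration time halves the channel spacing and doubles the number of channels, while the loss at the crossover frequency stays fixed because it depends only on the product $\Delta\omega T$.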

If it is assumed that the input noise has a band-limited 
white spectrum (of width $B$ cps) then $\Psi_{rr}(t)$ can be 
determined. For in this case, 

$$w(f) = w_0, \quad 0 < f < B; \qquad w(f) = 0, \quad B < f \tag{57}$$

where $w(f)$ is the power spectrum of the input noise. It 
is then not difficult to see that, except for very small 
values of $t$ and $\omega_r$, with but negligible error 

$$\Psi_{rr}(T) = w_0T \tag{58}$$

where $T$ is the coherent integration time (cf. Section III). 
If $T = 1/(4\Delta f)$, then the $G_{rs}$ terms of (30) will be small 
if $|r - s|$ is large. If $|r - s|$ is small and odd, $G_{rs}$ will not 
be negligible compared with $\Psi_{rr}$. However, numerical 
calculations seem to verify the conclusion that even for 
these cases and large $n$ the second term of the series is 
small compared with the first. Hence to a good 
approximation [cf. (55)], 

$$\alpha = 1 - \left(1 - e^{-\eta^2/2w_0T}\right)^n, \qquad \eta^2 = -2w_0T\log\left[1 - (1 - \alpha)^{1/n}\right]. \tag{59}$$

Substituting this value of $\eta$ in (54) and recalling that 
$n = 2BT$ by (56) we have, after a change of variable, that$^8$ 

$$\beta = \int_0^\xi u\,e^{-(u^2 + 2Qn)/2}\,I_0\!\left(u\sqrt{2Qn}\right)du \tag{60}$$

where 

$$Q = \frac{T(a^2 + b^2)}{2nw_0} = \frac{\rho_2}{n} = \frac{\rho_1}{2}$$

[cf. (47), (48)], and 

$$\xi = \sqrt{-2\log\left[1 - (1 - \alpha)^{1/n}\right]}. \tag{61}$$

$^8$ This is essentially the $Q$ function of J. I. Marcum, "A Statistical 
Theory of Target Detection by Pulsed Radar: Mathematical 
Appendix," The RAND Corp., Santa Monica, Calif., RM-753; 
July 1, 1948, reissued April 25, 1952. In Marcum's notation, 
our $\beta$ is $1 - Q(\sqrt{2Qn}, \xi)$. 
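The relation between the over-all false alarm probability and the threshold can be made concrete with a short sketch. It assumes the leading-term approximation $\alpha = 1 - (1 - e^{-\eta^2/2w_0T})^n$ and inverts it for $\eta^2$; the function names and sample values are illustrative only.

```python
import math

def threshold_sq(alpha, n, w0, T):
    """eta^2 = -2 w0 T log(1 - (1 - alpha)^(1/n)), the inverse of the relation below."""
    return -2.0 * w0 * T * math.log(1.0 - (1.0 - alpha) ** (1.0 / n))

def false_alarm(eta_sq, n, w0, T):
    """alpha = 1 - (1 - exp(-eta^2 / (2 w0 T)))^n, keeping only the leading term of C(eta)."""
    return 1.0 - (1.0 - math.exp(-eta_sq / (2.0 * w0 * T))) ** n

# Hypothetical values: alpha = 0.1, unit spectral level, T = 0.5 s.
for n in (1, 10, 53, 100):
    eta_sq = threshold_sq(0.1, n, 1.0, 0.5)
    print(n, eta_sq, false_alarm(eta_sq, n, 1.0, 0.5))
```

For a fixed over-all false alarm probability the threshold rises with the number of channels, which is the effect that competes with the signal-to-noise improvement.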


Thus, the problem is reduced to the determination of one 
parameter, namely $n$, such that $\beta$ is minimized. In the 
Appendix, $\beta$ is examined as a function of $n$. It turns out 
that $\beta$ is not a monotonic function of $n$. In fact, 

$$\lim_{n\to 0}\beta = 0 = \lim_{n\to\infty}\beta \tag{62}$$

and $\beta$ has a unique maximum at the only positive root of 

$$\frac{d\xi}{db} = \frac{I_1(b\xi)}{I_0(b\xi)} \tag{63}$$

where $b^2 = 2Qn$. For example, if the over-all false alarm 
probability $\alpha$ is 0.1 and the initial signal-to-noise ratio 
$\rho_1 = 0.01 = -20$ db, then the worst number of channels 
to use is 53 or 54. 
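The dependence of the missed-signal probability on $n$ can be explored numerically. The sketch below rests on two assumptions that should be checked against the original: that $b^2 = 2Qn = n\rho_1$ (which follows if $\Psi_{rr} = w_0T$, $n = 2BT$, and $\psi_0 = w_0B$), and that $\xi^2 = -2\log[1 - (1 - \alpha)^{1/n}]$. The integral is evaluated with the standard Poisson-weighted series for the noncentral chi-square distribution with two degrees of freedom, which is a swapped-in numerical technique rather than the method of the paper.

```python
import math

def missed_signal_prob(n, alpha, rho1, terms=200):
    """beta = int_0^xi u exp(-(u^2 + b^2)/2) I0(b u) du, assuming b^2 = n * rho1.

    Uses P(R < xi) = sum_j Poisson(j; b^2/2) * P(chi-square with 2j+2 d.f. < xi^2).
    """
    y = -math.log(1.0 - (1.0 - alpha) ** (1.0 / n))   # xi^2 / 2
    lam = n * rho1 / 2.0                               # b^2 / 2
    weight = math.exp(-lam)    # Poisson weight for j = 0
    term = math.exp(-y)        # e^{-y} y^m / m! for m = 0
    tail = term                # e^{-y} * sum_{m <= j} y^m / m!
    beta = 0.0
    for j in range(terms):
        beta += weight * (1.0 - tail)
        weight *= lam / (j + 1)
        term *= y / (j + 1)
        tail += term
    return beta

# Scan the number of channels for alpha = 0.1 and rho1 = 0.01 (-20 db).
worst = max(range(1, 301), key=lambda n: missed_signal_prob(n, 0.1, 0.01))
print(worst, missed_signal_prob(worst, 0.1, 0.01))
```

The location of the maximum found by the scan can be compared with the figure of 53 or 54 channels quoted in the text for these values of $\alpha$ and $\rho_1$.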

However, for the radar applications in which we are 
interested (cf. Figs. 4-11, pp. 246-247), the maximum value 
of $\beta$ occurs at a value of $n$ less than one. Thus, since the 
number of channels must be an integer, the missed-signal 
probability $\beta$ is a monotonic function of $n$. The best 
results are therefore obtainable by using as large a number 
of coherent integrators as is practical. Numerical results 
are presented graphically in Figs. 4 through 11. The 
over-all false alarm probability $\alpha$ is constant for each 
graph, while the initial signal-to-noise ratio $\rho_1$ varies from 
curve to curve, with the missed-signal probability $\beta$ 
plotted vs the number of channels $n$. 


APPENDIX 


The fundamental formula for the missed-signal 
probability is given by (60) as 

$$\beta = \int_0^\xi u\,e^{-(u^2 + b^2)/2}\,I_0(bu)\,du \tag{64}$$

where 

$$b = \sqrt{2Qn}, \qquad \xi = \sqrt{-2\log\left(1 - \bar\alpha^{1/n}\right)} \tag{65}$$

and $\bar\alpha = 1 - \alpha$. 
The integrand of (64) is non-negative and hence, 
Case 1, 

$$\beta \ge 0.$$

Since 

$$p(u) = u\,e^{-(u^2 + b^2)/2}\,I_0(bu), \quad u \ge 0; \qquad p(u) = 0, \quad u < 0$$

is a frequency function it is seen that for $b$ fixed and 
finite, Case 2, 

$$\int_0^\infty p(u)\,du = 1.$$

The first nontrivial properties of $\beta$ that will be 
demonstrated are, Case 3, 

$$\lim_{n\to 0}\beta = 0$$

and, Case 4, 

$$\lim_{n\to\infty}\beta = 0.$$

From (65), $\lim_{n\to 0} b = 0$ and $\lim_{n\to 0}\xi = 0$. Thus $\beta(0) = 0$ 
since the upper limit of the integral becomes zero 
while the integrand remains bounded (in fact it is 
precisely $u\,e^{-u^2/2}$). This proves Case 3. 

Figs. 4-11. Missed-signal probability $\beta$ vs number of integration channels $n$ with initial signal-to-noise ratio $\rho_1$ as parameter. 

A sequence of inequalities will be deduced in order to 
establish Case 4. From Stirling's formula for the factorial, 

$$2^{2n}(n!)^2 > (2n)!$$

for all positive integers $n$. Thus 

$$\frac{x^{2n}}{2^{2n}(n!)^2} < \frac{x^{2n}}{(2n)!} \tag{66}$$

for all nonzero $x$. The Bessel function $I_0(x)$ is defined by 

$$I_0(x) = \sum_{n=0}^\infty \frac{(x/2)^{2n}}{(n!)^2}$$

and using the inequality of (66) 

$$I_0(x) < \sum_{n=0}^\infty \frac{x^{2n}}{(2n)!} = \cosh x < e^x, \qquad x > 0.$$

[Of course $I_0(x)$ is asymptotic to $(1/\sqrt{2\pi x})\,e^x$; but in the 
present investigations, strict inequalities are desired.] 
Thus one may write (64) as 

$$0 \le \beta(n) = \int_0^\xi u\,e^{-(u^2 + b^2)/2}\,I_0(bu)\,du < \int_0^\xi u\,e^{-(u^2 + b^2)/2}\,e^{bu}\,du = \int_0^\xi u\,e^{-(b - u)^2/2}\,du. \tag{67}$$
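The chain of strict inequalities $I_0(x) < \cosh x < e^x$ for $x > 0$ is easy to confirm numerically from the power series. This is a small sketch; the function name and the truncation at 60 terms are assumptions that are adequate for the moderate arguments used here.

```python
import math

def bessel_i0(x, terms=60):
    """Power series I0(x) = sum_k (x/2)^(2k) / (k!)^2, truncated after `terms` terms."""
    return sum((x / 2) ** (2 * k) / math.factorial(k) ** 2 for k in range(terms))

for x in (0.1, 1.0, 5.0):
    print(x, bessel_i0(x), math.cosh(x), math.exp(x))
```

Each printed row should be strictly increasing from left to right after the first column.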


It is now necessary to obtain an inequality on $\xi$. From 
the binomial theorem the inequality 

$$1 - x^{1/n} \ge \frac{1 - x}{n}, \qquad 0 < x < 1 \tag{68}$$

is readily deduced. Letting $x = 1 - \alpha = \bar\alpha$ and recalling that 
$0 < \alpha < 1$, the above inequality becomes 

$$1 - \bar\alpha^{1/n} \ge \frac{1 - \bar\alpha}{n}$$

or 

$$-\log\left(1 - \bar\alpha^{1/n}\right) \le -\log(1 - \bar\alpha) + \log n. \tag{69}$$

Since for $n$ sufficiently large, $\log n > -\log(1 - \bar\alpha)$ one 
may write (69) as $-\log(1 - \bar\alpha^{1/n}) < 2\log n$. Hence 

$$\xi < 2\sqrt{\log n}$$

for $n$ sufficiently large and (67) becomes 

$$\beta(n) < \int_0^{2\sqrt{\log n}} u\,e^{-(b - u)^2/2}\,du. \tag{70}$$

Now $e^{-(b - u)^2/2}$ will assume its maximum value when 
$|b - u|$ is as small as possible. For $n$ large $b$ is always 
greater than $u$ for $0 \le u \le 2\sqrt{\log n}$. Thus $|b - u| \ge 
|b - 2\sqrt{\log n}|$ for $n$ large and from (70) 

$$\beta(n) < \int_0^{2\sqrt{\log n}} 2\sqrt{\log n}\;e^{-(b - 2\sqrt{\log n})^2/2}\,du = 4(\log n)\,e^{-\frac{1}{2}\left(\sqrt{2Qn} - 2\sqrt{\log n}\right)^2}.$$

But $\sqrt{2Qn} > 4\sqrt{\log n}$ for $n$ sufficiently large. Hence 

$$\beta(n) < (4\log n)\,e^{-2\log n} = \frac{4\log n}{n^2}$$

for $n$ sufficiently large, and clearly 

$$\lim_{n\to\infty}\beta(n) = 0.$$

Fig.   Over-All False Alarm Probability   Missed-Signal Probability 
4      $\alpha = 0.05$                    $\beta \le 0.5$ 
5      $\alpha = 0.05$                    $\beta \le 0.05$ 
6      $\alpha = 0.01$                    $\beta \le 0.5$ 
7      $\alpha = 0.01$                    $\beta \le 0.05$ 
8      $\alpha = 0.001$                   $\beta \le 0.5$ 
9      $\alpha = 0.001$                   $\beta \le 0.05$ 
10     $\alpha = 0.0001$                  $\beta \le 0.5$ 
11     $\alpha = 0.0001$                  $\beta \le 0.05$ 

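It is clear from Cases 1, 3, and 4 that $\beta(n)$ must have 
at least one maximum. To determine this maximum the 
derivative $d\beta/db$ will be calculated and set equal to zero. 
From (64), 

$$\frac{d\beta}{db} = \int_0^\xi u\,e^{-(u^2 + b^2)/2}\left[-b\,I_0(bu) + \frac{d}{db}I_0(bu)\right]du + \xi\,e^{-(\xi^2 + b^2)/2}\,I_0(b\xi)\,\frac{d\xi}{db}. \tag{71}$$

From the identity $dI_0(bu)/db = u\,I_1(bu)$ between the 
Bessel functions $I_0$ and $I_1$, the second part of the integral 
of (71) becomes 

$$K = \int_0^\xi u\,e^{-(u^2 + b^2)/2}\,\frac{d}{db}I_0(bu)\,du = \int_0^\xi u^2 e^{-(u^2 + b^2)/2}\,I_1(bu)\,du.$$

Upon integrating by parts and recalling the further 
identity $d[u\,I_1(bu)]/du = bu\,I_0(bu)$, we obtain 

$$K = -\xi\,e^{-(\xi^2 + b^2)/2}\,I_1(b\xi) + b\int_0^\xi u\,e^{-(u^2 + b^2)/2}\,I_0(bu)\,du.$$

Substituting this formula in (71) yields 

$$\frac{d\beta}{db} = \xi\,e^{-(\xi^2 + b^2)/2}\left[I_0(b\xi)\,\frac{d\xi}{db} - I_1(b\xi)\right]$$

and $\beta$ has a stationary point at the value of $n$ which 
satisfies the equation 

$$\frac{d\xi}{db} = \frac{I_1(b\xi)}{I_0(b\xi)}. \tag{72}$$

Now 

$$\frac{d\xi}{db} = \frac{2}{b\xi n}\left(\log\frac{1}{\bar\alpha}\right)\frac{\bar\alpha^{1/n}}{1 - \bar\alpha^{1/n}}$$

and thus (72) may be written as, Case 5, 

$$\frac{I_1(b\xi)}{I_0(b\xi)} = \frac{2}{b\xi n}\left(\log\frac{1}{\bar\alpha}\right)\frac{\bar\alpha^{1/n}}{1 - \bar\alpha^{1/n}}. \tag{73}$$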


It is not hard to see that 
for all $\alpha$ with $0 < \alpha < 1$ and all $n > 0$. Also 


(1 r 5) ae 
i Glew ars 


is asymptotic to 2 as n approaches infinity while 


I, (08) 
Ty(bé) 


as $n$ approaches infinity since $I_1(b\xi)/I_0(b\xi)$ is asymptotic 
to one. From this discussion of the functions appearing in 
(73) and their convexity (which is most conveniently 
investigated graphically but can also be done analytically) 
it can be concluded that (73) has a simple root, say at 
$n = N$. Thus $\beta$ has a single maximum at $n = N$. 

To summarize, the missed-signal probability $\beta$ is 
asymptotic to zero as the number of channels $n$ increases 
without limit and has a maximum at $n = N$ where $N$ 
is the only root of the scalar equation (73). 

The Sequential Detection of a Sine-Wave Carrier 
of Arbitrary Duty Ratio in Gaussian Noise* 


H. BLASBALG† 


Summary: In this paper the Wald theory of sequential analysis 
is applied to the detection of a sine-wave carrier of arbitrary duty 
ratio in Gaussian noise. This is a generalization of a familiar prob- 
lem. The detector law for the problem is obtained. In particular, it 
is specialized to the important cases: 1) arbitrary duty ratio and 
signal-to-noise ratio less than unity and 2) duty ratio much less 
than unity and peak-signal-to-noise ratio much greater than unity. 
For the latter case, it is shown that the best detector law goes over 
into a Bernoulli detector. In the former case it is shown that the 
only important parameter in detection is the average signal-power 
to noise-power ratio. For the case of unity duty ratio the detector 
law goes over into the familiar $\log I_0$ characteristic. Furthermore, 
for threshold signals, it is shown that, in general, a first-order 
approximation to the logarithm of the likelihood ratio (or to the 
detector law) does not permit the sequential test to converge at one 
of the threshold parameters. A second-order approximation is always 
required. Curves of the operating characteristic function and the 
average sample number function are given for threshold signals. 


I. INTRODUCTION


In this paper the detection of a pulsed carrier of
arbitrary duty ratio in Gaussian noise is considered.
To be more specific, we consider a set of observations
at the independent time intervals $\Delta t_1, \Delta t_2, \ldots$ such that
when a pulsed carrier of duty ratio $d$ is transmitted, on
the average, a fraction $d$ of the intervals are occupied by
samples of carrier-plus-noise and a fraction $1 - d$ are occupied by
noise only. In statistical language, $d$ is the probability of
measuring a signal-plus-noise sample in an arbitrary time
interval on condition that a pulsed carrier is transmitted.




In the absence of a pulsed carrier all the intervals are
occupied by noise. This corresponds to the case $d = 0$
and signal-to-noise ratio $\eta = 0$. We wish to find the
sequential filter law for detecting the presence or absence
of a pulsed carrier of arbitrary duty ratio $0 < d \le 1$ and
peak signal-to-noise ratio $\eta > 0$, in Gaussian noise. It is
also desirable to investigate the detector law for those
ranges of duty ratio and signal-to-noise ratio which are
of practical interest.

An obvious application of this problem is the detection
of pulse system activity in different regions of the
spectrum. An electronically tuned receiver can be used
to search the spectrum for pulse signals. Other things being
equal, the more pulse signals that appear in a given
frequency band the faster signal activity will be indicated.
Hence, more time will be spent in those bands where the
signal density is low and less time will be spent in the high
density regions.

Another application of this detector is to the detection
of targets in space by a high-speed search radar. In this
application, the target detection time will decrease with
the density of targets at a given range; equivalently, the higher the
density of targets the greater the range at which detection
can be realized. This detector automatically trades duty
ratio and signal-to-noise ratio for observation time in an
efficient and natural way. It is certain that other
applications exist.


II. BRIEF FORMULATION OF SEQUENTIAL
DETECTION THEORY


A sequential test [1], [8] can be described as follows: at
$t = t_1$ an observer measures the sample value $r_1$. Based on
this value he must decide whether to accept the hypothesis
$H_0$ that the parameter of the distribution is $\eta \le \eta_0$,
or $H_1$ that $\eta \ge \eta_1$ ($\eta_0 < \eta_1$), or he must decide that the
datum is insufficient to accept either one of the hypotheses
with confidence. If $H_0$ or $H_1$ is accepted, the experiment
is terminated. On the other hand, if the datum is insufficient
to lead to the acceptance of one of these hypotheses,
then at $t = t_2$ the observer takes another sample
$r_2$. Based on the sample of size two $(r_1, r_2)$ the observer
must once again make one of three possible decisions:
accept $H_0$, accept $H_1$, or declare the data insufficient for accepting
either. If the hypothesis $H_0$ that only noise is present or
$H_1$ that signal-plus-noise is present is accepted, the
experiment is terminated. If the data are insufficient for
a decision, $r_3$ is observed. The same decision procedure is
then repeated on the sample point $(r_1, r_2, r_3)$. Experimentation
is continued in this manner until either $H_0$ or $H_1$
is accepted. It is clear that the number of samples required
for the termination of a sequential test with the acceptance
of the hypothesis $H_0$ or $H_1$ is a random variable.

In a sequential test the observer specifies the error
probability $\alpha$ of accepting the hypothesis $H_1$ when $H_0$
is true and the error probability $\beta$ of accepting $H_0$ when
$H_1$ is true. From the class of all sequential tests, the
decision rule which minimizes the average number of
samples required for termination at the threshold parameters
$\eta = \eta_0$ and $\eta = \eta_1$ is chosen. The rule for independent
sampling is given by the following: accept $H_0$,
that noise is present, at that value of the sample number
$n$ for which,

$$\sum_{i=1}^{n} \log \frac{P(r_i, \eta_1)}{P(r_i, \eta_0)} \le \log \frac{\beta}{1-\alpha}, \tag{1}$$

and accept $H_1$, that signal-plus-noise is present, at that
value of $n$ for which,

$$\sum_{i=1}^{n} \log \frac{P(r_i, \eta_1)}{P(r_i, \eta_0)} \ge \log \frac{1-\beta}{\alpha}, \tag{2}$$

where

$P(r, \eta)$ = probability density function of the random
variable $r$ when $\eta$ is the true parameter,

$\alpha$ = probability of accepting $H_1$ when $H_0$ is true,

$1 - \alpha$ = probability of accepting $H_0$ when $H_0$ is true,

$\beta$ = probability of accepting $H_0$ when $H_1$ is true,

$1 - \beta$ = probability of accepting $H_1$ when $H_1$ is true.


The left side of Eq. (1) or (2) is the logarithm of the 
probability ratio or likelihood ratio. 
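The decision rule of (1) and (2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `wald_thresholds` and `sequential_test` are hypothetical, and the caller is assumed to supply a function returning the per-sample logarithm of the probability ratio.

```python
import math

def wald_thresholds(alpha, beta):
    """Return the lower and upper boundaries used in (1) and (2)."""
    lower = math.log(beta / (1.0 - alpha))      # accept H0 at or below this
    upper = math.log((1.0 - beta) / alpha)      # accept H1 at or above this
    return lower, upper

def sequential_test(samples, log_ratio, alpha, beta):
    """Accumulate log_ratio(r) sample by sample until a boundary is crossed.

    log_ratio(r) is assumed to return log[P(r, eta1) / P(r, eta0)].
    Returns (decision, n); decision is "H0", "H1", or None when the
    samples are exhausted before either boundary is reached.
    """
    lower, upper = wald_thresholds(alpha, beta)
    total = 0.0
    n = 0
    for n, r in enumerate(samples, start=1):
        total += log_ratio(r)
        if total <= lower:
            return "H0", n
        if total >= upper:
            return "H1", n
    return None, n
```

The number of samples consumed before a decision is returned alongside the decision, since it is the random variable whose average the asn function describes.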

The two most important characteristics for judging
the performance of a sequential detector are the operating
characteristic function (oc function) and the average
sample number function (asn function). The oc function
$L(\eta)$ gives the probability of accepting the hypothesis
$H_0$ as a function of the parameter $\eta$ of the distribution
function. It expresses the confidence in the decision.

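The mathematical expression for this function is given
by the pair of parametric equations,

$$L_\eta(h) = \frac{\left(\frac{1-\beta}{\alpha}\right)^{h} - 1}{\left(\frac{1-\beta}{\alpha}\right)^{h} - \left(\frac{\beta}{1-\alpha}\right)^{h}}, \tag{3}$$

and,

$$E_\eta(e^{hz}) = 1, \tag{4}$$

where,

$$z = \log \frac{P(r, \eta_1)}{P(r, \eta_0)}, \tag{5}$$

$E_\eta[e^{hz}]$ = expected value of the random variable $e^{hz}$.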


For a given $\alpha$, $\beta$, $L_\eta(h)$ is a universal function of $h$. The
parameter $\eta$ of the distribution is related to $h$ by (4).
From (3) and (4) the oc function $L(\eta)$ can be obtained.

The asn function gives the average number of samples
required for the termination of a sequential test as a
function of the parameter $\eta$. It is an expression of the
cost of experimentation in terms of the number of samples.
The mathematical expression for the asn function is given
by,

$$E_\eta(n) = \frac{L(\eta) \log \frac{\beta}{1-\alpha} + [1 - L(\eta)] \log \frac{1-\beta}{\alpha}}{E_\eta(z)}. \tag{6}$$


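At the indeterminate point $h = 0$ corresponding to a value
of $\eta = \eta'$ at which $E_\eta(z) = 0$, we have for the oc function,

$$L(\eta') = \frac{\log \frac{1-\beta}{\alpha}}{\log \frac{1-\beta}{\alpha} + \log \frac{1-\alpha}{\beta}}, \tag{7}$$

and for the asn function,

$$E_{\eta'}(n) = \frac{\left(\log \frac{1-\beta}{\alpha}\right)\left(\log \frac{1-\alpha}{\beta}\right)}{E_{\eta'}(z^2)}. \tag{8}$$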


The oc and asn functions are approximations since the 
excess over the boundaries of (1) and (2) is neglected. 
The approximations are however extremely good for 
practical applications [8]. The results of sequential theory 
briefly stated will now be applied to the detection of a 
pulsed sine-wave-carrier of arbitrary duty ratio in Gaussian 
noise. 
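As a rough numerical companion to (3), (6), (7), and (8), the following Python sketch evaluates the oc and asn approximations. The function names are hypothetical, and the moments $E(z)$ and $E(z^2)$ are assumed to be supplied by the caller for the distribution of interest.

```python
import math

def oc_of_h(h, alpha, beta):
    """OC function of (3); falls back to the limit (7) near h = 0."""
    log_a = math.log((1.0 - beta) / alpha)      # upper boundary
    log_b = math.log(beta / (1.0 - alpha))      # lower boundary (negative)
    if abs(h) < 1e-9:
        return log_a / (log_a - log_b)
    a_h = math.exp(h * log_a)
    b_h = math.exp(h * log_b)
    return (a_h - 1.0) / (a_h - b_h)

def asn(oc, mean_z, alpha, beta):
    """ASN of (6), given the OC value and a nonzero E(z) at the same parameter."""
    log_a = math.log((1.0 - beta) / alpha)
    log_b = math.log(beta / (1.0 - alpha))
    return (oc * log_b + (1.0 - oc) * log_a) / mean_z

def asn_at_zero_drift(mean_z2, alpha, beta):
    """ASN of (8) at the parameter value where E(z) = 0; mean_z2 is E(z^2)."""
    log_a = math.log((1.0 - beta) / alpha)
    log_b = math.log(beta / (1.0 - alpha))
    return -log_a * log_b / mean_z2
```

At $h = 1$ and $h = -1$ the first function returns $1 - \alpha$ and $\beta$ respectively, which is the behavior expected at the two threshold parameters.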


III. THE DETECTION OF A SINE-WAVE CARRIER
OF ARBITRARY DUTY RATIO IN GAUSSIAN NOISE


The probability density function of the envelope of
Gaussian noise after passing through a narrow band-pass
filter is given by [5]

$$P_0(r) = r e^{-\frac{1}{2} r^2}, \tag{9}$$

where,
$r = R/\sigma_N$ = normalized random variable,
$R$ = envelope voltage sample,
$\sigma_N$ = rms value of noise at input to band-pass filter.


The probability density function of the envelope of a
sine-wave carrier plus noise is [5],

$$P_\eta(r) = r e^{-\frac{1}{2}(r^2 + \eta^2)} I_0(\eta r), \tag{10}$$

where,

$\eta = A/\sigma_N$ = peak-signal-to-rms noise ratio,

$I_0(\eta r) = \sum_{k=0}^{\infty} \left(\frac{\eta r}{2}\right)^{2k} \frac{1}{(k!)^2}$ = zero order modified Bessel function.

When $\eta = 0$, (10) reduces to (9).
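The two densities can be checked numerically. The sketch below is an illustration only: it evaluates the Bessel series from (10) directly and uses a plain trapezoidal rule, and the helper names (`bessel_i0`, `p_noise`, `p_signal`, `moment`) as well as the integration limit and step count are arbitrary choices made for this example.

```python
import math

def bessel_i0(x, terms=60):
    """Zero order modified Bessel function from the power series in (10)."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (x / 2.0) ** 2 / ((k + 1) ** 2)   # ratio of successive terms
    return total

def p_noise(r):
    """Envelope density of noise alone, (9)."""
    return r * math.exp(-0.5 * r * r)

def p_signal(r, eta):
    """Envelope density of carrier plus noise, (10)."""
    return r * math.exp(-0.5 * (r * r + eta * eta)) * bessel_i0(eta * r)

def moment(pdf, power, upper=12.0, steps=6000):
    """Trapezoidal estimate of the integral of r**power * pdf(r) on [0, upper]."""
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        r = i * h
        w = 0.5 if i in (0, steps) else 1.0
        total += w * (r ** power) * pdf(r)
    return total * h
```

With $\eta = 1$, for example, `moment(lambda r: p_signal(r, 1.0), 2)` should come out close to $\eta^2 + 2 = 3$, the mean square value of the envelope; the truncated series is adequate only for moderate arguments.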

Let $d$ be the duty ratio of the pulsed carrier in the
sense described in Section I. In the absence of a carrier only
noise is present. Hence, the probability density is given
by (9). The probability under $H_1$ of the random variable
$r$ exceeding some arbitrary threshold is equal to the
probability of occurrence of signal-plus-noise in any one
of the intervals times the probability that signal-plus-noise
exceeds the threshold plus the probability of observing
noise times the probability that the noise exceeds the
threshold. The distribution function of a pulsed carrier
in noise is therefore,

$$P_{\eta,d}(r) = d P_\eta(r) + [1 - d] P_0(r), \qquad 0 \le d \le 1, \tag{11}$$


where $P_0(r)$ and $P_\eta(r)$ are given by (9) and (10). From
(11) it is seen that $d = 0$, $\eta = 0$ simultaneously, or $d = 0$,
$\eta > 0$, or $d > 0$, $\eta = 0$, all lead to the distribution $P_0(r)$.
This is consistent with the manner in which the problem is
formulated.

In order to detect the presence or absence of a carrier
in noise, we test the hypothesis $H_0$ that the signal-to-noise
ratio is $\eta = 0$ (implying $d = 0$) against the alternative
hypothesis $H_1$ that $\eta > 0$ (implying $d > 0$). Let $\eta_1 > 0$
and $d_0 > 0$ be some numbers arbitrarily close to zero. Then
for practical reasons, we must test the hypothesis $\eta = 0$
against $\eta \ge \eta_1$. This also implies testing $d = 0$ against $d \ge d_0$. In order to
obtain the decision regions, we make use of the general
results given by (1) and (2). The logarithm of the probability
ratio corresponding to a particular sample $r_i$
is obtained from (11) as,


$$\log \frac{P_{\eta_1,d_0}(r_i)}{P_0(r_i)} = \log \left[1 + d_0\left(\frac{P_{\eta_1}(r_i)}{P_0(r_i)} - 1\right)\right]. \tag{12}$$


From (9) and (10), the ratio is

$$\frac{P_{\eta_1}(r_i)}{P_0(r_i)} = e^{-\frac{1}{2}\eta_1^2} I_0(\eta_1 r_i). \tag{13}$$

Corresponding to an observation of $n$ independent
samples we have from (1), (2), (12), and (13) the decision
regions,


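$$\sum_{i=1}^{n} \log \left[1 + d_0\left(e^{-\frac{1}{2}\eta_1^2} I_0(\eta_1 r_i) - 1\right)\right] \le \log \frac{\beta}{1-\alpha} \rightarrow H_0, \tag{14}$$

and,

$$\sum_{i=1}^{n} \log \left[1 + d_0\left(e^{-\frac{1}{2}\eta_1^2} I_0(\eta_1 r_i) - 1\right)\right] \ge \log \frac{1-\beta}{\alpha} \rightarrow H_1. \tag{15}$$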


The arrow indicates the hypothesis to which the decision 
region belongs. For the special case, d = 1, which corre- 
sponds in Section I to the case where all the observation 
intervals are occupied by signal-plus-noise (when signal 
is in fact present) the decision regions reduce to the 
familiar form, 


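$$\sum_{i=1}^{n} \log I_0(\eta_1 r_i) \le \log \frac{\beta}{1-\alpha} + \frac{n \eta_1^2}{2} \rightarrow H_0, \tag{16}$$

and

$$\sum_{i=1}^{n} \log I_0(\eta_1 r_i) \ge \log \frac{1-\beta}{\alpha} + \frac{n \eta_1^2}{2} \rightarrow H_1. \tag{17}$$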


The detector for the special case above is the familiar
$\log I_0(\eta_1 r)$ detector. This result has been obtained when
considering the detection of a target at a particular range
by means of a pulsed radar [6]. The detector law given by
(14) and (15) for the more general case calls for a complicated
computer. Furthermore, machine methods are required
to calculate the oc and asn functions. We will therefore
specialize the results to those ranges of the parameters
$d_0$ and $\eta_1$ that are of practical interest.
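For concreteness, the per-sample term of the general detector law (14) and (15) can be computed as in the Python sketch below. The names `z_sample` and `detector_sum` are hypothetical, and the Bessel function is evaluated from its power series, which is an assumption suited only to moderate values of $\eta_1 r$.

```python
import math

def bessel_i0(x, terms=60):
    """Zero order modified Bessel function from its power series."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (x / 2.0) ** 2 / ((k + 1) ** 2)
    return total

def z_sample(r, d0, eta1):
    """One term of the sums in (14) and (15) for envelope sample r."""
    ratio = math.exp(-0.5 * eta1 ** 2) * bessel_i0(eta1 * r)   # P_eta1 / P_0
    return math.log(1.0 + d0 * (ratio - 1.0))

def detector_sum(samples, d0, eta1):
    """Left side of (14) and (15) for a list of envelope samples."""
    return sum(z_sample(r, d0, eta1) for r in samples)
```

Setting `d0 = 1.0` reduces each term to $\log I_0(\eta_1 r) - \eta_1^2/2$, the unity duty ratio case; the running sum is then compared with the two boundaries exactly as in any sequential probability ratio test.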


IV. SEQUENTIAL DETECTOR LAW AND CHARACTERISTICS
FOR A PULSED CARRIER OF ARBITRARY DUTY RATIO
IN NOISE OF SIGNAL-TO-NOISE RATIO $\eta_1 < 1$


The case now considered is the detection of threshold 
signals in noise, or signals of arbitrary duty ratio and 
signal-to-noise ratio less than unity. Consider a particular 
term in (14) or (15), 


$$z = \log \left[1 + d_0\left(e^{-\frac{1}{2}\eta_1^2} I_0(\eta_1 r) - 1\right)\right]. \tag{18}$$


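The first two terms in the Taylor series approximation
to the logarithm of (18) are given by,

$$z \approx d_0\left[e^{-\frac{1}{2}\eta_1^2} I_0(\eta_1 r) - 1\right] - \frac{d_0^2}{2}\left[e^{-\frac{1}{2}\eta_1^2} I_0(\eta_1 r) - 1\right]^2. \tag{19}$$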


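Let us further expand (19) in a Taylor series in powers of
$\eta_1$ by using the expansion for $I_0(\eta_1 r)$. Then, neglecting all
orders higher than the fourth leads to

$$z(r; d_0, \eta_1) \approx -\frac{d_0 \eta_1^2}{2}\left[1 - \frac{\eta_1^2}{4}(1 - d_0)\right] + \frac{d_0 \eta_1^2 r^2}{4}\left[1 - \frac{\eta_1^2}{2}(1 - d_0)\right] - \frac{d_0 \eta_1^4 r^4}{32}\left[d_0 - \frac{1}{2}\right]. \tag{20}$$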

Eq. (20) gives an approximation to the detector law for the
case where the signal-to-noise ratio is $\eta_1 < 1$ and for
arbitrary duty ratio. Eq. (20) shows that the detector
law contains a bias given by the first term and a linear
combination of second and fourth order terms. The
reason for including the fourth order term will become
evident shortly. For the special case, $d_0 = 1$, we have,

$$z(r; 1, \eta_1) = -\frac{\eta_1^2}{2} + \frac{\eta_1^2 r^2}{4} - \frac{\eta_1^4 r^4}{64}. \tag{21}$$

This expression is equivalent to that obtained in [4] and
serves as somewhat of a check on the validity of (20).
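A quick numerical comparison shows how closely a fourth order expansion tracks the exact term (18) for small $\eta_1$. In the Python sketch below the function names are hypothetical, and the coefficients in `z_fourth_order` are an assumption: they were obtained here by expanding (18) in powers of $\eta_1$ through the fourth order, and they reduce to (21) when $d_0 = 1$.

```python
import math

def bessel_i0(x, terms=60):
    """Zero order modified Bessel function from its power series."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (x / 2.0) ** 2 / ((k + 1) ** 2)
    return total

def z_exact(r, d0, eta1):
    """Exact per-sample log probability ratio, (18)."""
    return math.log(1.0 + d0 * (math.exp(-0.5 * eta1 ** 2) * bessel_i0(eta1 * r) - 1.0))

def z_fourth_order(r, d0, eta1):
    """Expansion of (18) through fourth order in eta1 (assumed coefficients)."""
    e2, e4 = eta1 ** 2, eta1 ** 4
    bias = -0.5 * d0 * e2 * (1.0 - 0.25 * e2 * (1.0 - d0))
    square = 0.25 * d0 * e2 * r ** 2 * (1.0 - 0.5 * e2 * (1.0 - d0))
    fourth = -(d0 * e4 * r ** 4 / 32.0) * (d0 - 0.5)
    return bias + square + fourth
```

For $\eta_1 = 0.2$ and envelope samples up to about three times the rms noise, the two functions agree to within a few parts in $10^5$, which supports treating the expansion as a working detector law in the threshold region.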
Let $E_{\eta,d}(z)$ be the expected value of $z(r; d_0, \eta_1)$ on
condition that a carrier of signal-to-noise ratio $\eta$ and duty
ratio $d$ is present or

$$E_{\eta,d}(z) = \int_0^\infty z(r; d_0, \eta_1) P_{\eta,d}(r) \, dr, \tag{22}$$

where $P_{\eta,d}(r)$ is defined by (11) in conjunction with
(9) and (10). From [7] we know that,

$$E_\eta(r^2) = \int_0^\infty r^2 P_\eta(r) \, dr = \eta^2 + 2, \tag{23}$$

where $P_\eta(r)$ is given by (10) and,

$$E_\eta(r^4) = \int_0^\infty r^4 P_\eta(r) \, dr = 8 + 8\eta^2 + \eta^4. \tag{24}$$


From (11) and using the above results we have,

$$E_{\eta,d}(r^2) = d(2 + \eta^2) + 2(1 - d) = 2 + d\eta^2, \tag{25}$$

and,

$$E_{\eta,d}(r^4) = d(8 + 8\eta^2 + \eta^4) + 8(1 - d) = 8 + 8d\eta^2 + d\eta^4. \tag{26}$$


Making use of (25) and (26) and applying these to (22)
yields

$$E_{\eta,d}(z) \approx \frac{d \, d_0 \eta^2 \eta_1^2}{4} - \frac{d_0^2 \eta_1^4}{8}. \tag{27}$$

In this development certain terms have been neglected.
However, (27) is a good approximation for the range
of values indicated and in particular in the interval
$0 \le \eta \le \eta_1$. Eq. (27) gives the average value of $z(r; d_0, \eta_1)$.
When $d = d_0 = 1$ and $\eta = \eta_1$ we obtain the result equivalent
to that obtained in [4],

$$E_{\eta_1}(z) = \frac{\eta_1^4}{8}. \tag{28}$$

Similarly, when $\eta = 0$ we obtain,

$$E_0(z) = -\frac{\eta_1^4}{8}. \tag{29}$$


This once again is a check on the expansions. 
Suppose we consider only the first term in the expansion 
of the logarithm of the likelihood ratio given in (19). 


$$z_0 \approx d_0\left[e^{-\frac{1}{2}\eta_1^2} I_0(\eta_1 r) - 1\right]. \tag{30}$$


If we take the average value of $z_0$ on condition that only
noise is present we have,

$$E_0(z_0) = d_0 \int_0^\infty P_{\eta_1}(r) \, dr - d_0 \int_0^\infty P_0(r) \, dr = 0, \tag{31}$$

since the integrals represent the areas under the given
probability density functions. The asn function given in
(6) contains the parameter $E_\eta(z)$ in the denominator.
For the approximation considered in (30), $E_\eta(z_0) = 0$
at $\eta = 0$. Hence, the asn function is infinite at this point
or the sequential test does not converge in probability
when noise is present. We therefore conclude that at least
two terms in the approximation to the logarithm of the
likelihood ratio are necessary for the sequential test to
converge. This requires that for small signal-to-noise
ratios the detector law must include a fourth order term
as well as second order and constant terms. This property
was also observed in [4]. We will later prove, in general,
that a sequential probability ratio test never converges
at the threshold parameter $\theta = \theta_0$ when a first order
approximation to the logarithm of the likelihood ratio
is used.
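The vanishing noise-only mean of the first order term can be confirmed numerically. The Python sketch below is an illustration under stated assumptions: the helper names are hypothetical, the integral is a plain trapezoidal rule on a truncated range, and the two-term statistic is the approximation (19) rather than the exact logarithm.

```python
import math

def bessel_i0(x, terms=60):
    """Zero order modified Bessel function from its power series."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (x / 2.0) ** 2 / ((k + 1) ** 2)
    return total

def first_order(r, eta1, d0):
    """First term of the logarithm expansion, (30)."""
    return d0 * (math.exp(-0.5 * eta1 ** 2) * bessel_i0(eta1 * r) - 1.0)

def second_order(r, eta1, d0):
    """First two terms of the logarithm expansion, (19)."""
    u = first_order(r, eta1, d0)
    return u - 0.5 * u * u

def noise_mean(stat, eta1, d0, upper=12.0, steps=6000):
    """Trapezoidal estimate of the mean of stat under the noise density (9)."""
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        r = i * h
        w = 0.5 if i in (0, steps) else 1.0
        total += w * stat(r, eta1, d0) * r * math.exp(-0.5 * r * r)
    return total * h
```

With $\eta_1 = 0.3$ and $d_0 = 0.5$ the first order mean is zero to within integration error, whereas the two-term mean is close to $-d_0^2 \eta_1^4 / 8$, so only the latter gives the test a nonzero drift when noise alone is present.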

The statistical decision regions for the acceptance of
the hypothesis $H_0$ that noise is present or $H_1$ that
signal-plus-noise is present are obtained from (1), (2), and (20) as,

$$\sum_{i=1}^{n} z(r_i; d_0, \eta_1) \le \log \frac{\beta}{1-\alpha} \rightarrow H_0, \tag{32}$$

$$\sum_{i=1}^{n} z(r_i; d_0, \eta_1) \ge \log \frac{1-\beta}{\alpha} \rightarrow H_1. \tag{33}$$


V. THE OPERATING CHARACTERISTIC FUNCTION (OC
FUNCTION) FOR $\eta_1 < 1$ AND ARBITRARY DUTY RATIO


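The oc function is given by the parametric equations (3) and (4) or,

$$L_{\eta,d}(h) = \frac{\left(\frac{1-\beta}{\alpha}\right)^{h} - 1}{\left(\frac{1-\beta}{\alpha}\right)^{h} - \left(\frac{\beta}{1-\alpha}\right)^{h}}, \tag{34}$$

and,

$$E_{\eta,d}(e^{hz}) = 1, \tag{35}$$

where $z$ is given by (20). Let us expand (35) in a Taylor
series and neglect all orders higher than the second. For
the problem considered we have,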


$$h(\eta, d) = -2 \, \frac{E_{\eta,d}(z)}{E_{\eta,d}(z^2)}. \tag{36}$$

The numerator of (36) is given by (27). The denominator
can be obtained from (20) with the help of relationships
(25) and (26). Neglecting orders in $\eta_1$ higher than the
fourth, we have,


h(n, d) = Wie Lee ee) 
In the region $0 \le \eta \le \eta_1$, which is actually the region of
interest, we have,

$$h(\eta, d) \approx -2 \, \frac{d}{d_0}\left(\frac{\eta}{\eta_1}\right)^2 + 1, \qquad 0 \le \eta \le \eta_1. \tag{38}$$

(This is a good approximation when $\eta_1 < 0.3$).

At the threshold parameter $\eta = 0$ we have $h = 1$ and
when $d = d_0$, $\eta = \eta_1$ we have $h = -1$, as required. This
serves as a partial check of the approximation (38) since
for all sequential tests $h$ takes on these values at the
threshold parameters. When $\eta = \eta_1$ and $d$ is some arbitrary
number we have,

$$h(\eta_1, d) = -2 \, \frac{d}{d_0} + 1. \tag{39}$$

Substituting (38) into (34) we obtain an approximation
to the oc function as,

$$L(\eta, d) = \frac{\left(\frac{1-\beta}{\alpha}\right)^{[-2(d/d_0)(\eta/\eta_1)^2 + 1]} - 1}{\left(\frac{1-\beta}{\alpha}\right)^{[-2(d/d_0)(\eta/\eta_1)^2 + 1]} - \left(\frac{\beta}{1-\alpha}\right)^{[-2(d/d_0)(\eta/\eta_1)^2 + 1]}}, \qquad 0 < \eta_1 < 0.3. \tag{40}$$


We now define the following:

$P_0 = d_0 \eta_1^2$ = minimum detectable average signal-power-to-noise-power ratio,
$P = d \eta^2$ = average signal-power-to-noise-power ratio
actually received.
Then (40) becomes,

$$L(P) = \frac{\left(\frac{1-\beta}{\alpha}\right)^{[-2(P/P_0) + 1]} - 1}{\left(\frac{1-\beta}{\alpha}\right)^{[-2(P/P_0) + 1]} - \left(\frac{\beta}{1-\alpha}\right)^{[-2(P/P_0) + 1]}}. \tag{41}$$

It is therefore seen that when the peak-signal-to-rms-noise
ratio is less than unity the error probabilities given by the
oc function depend only on the average power. At the
indeterminate point $P' = P_0/2$ or $(d\eta^2)' = d_0 \eta_1^2/2$ we
have from (7)

$$L(P') = \frac{\log \frac{1-\beta}{\alpha}}{\log \frac{1-\beta}{\alpha} + \log \frac{1-\alpha}{\beta}}. \tag{42}$$

Fig. 1 is a curve of the oc function for $\alpha = \beta = 0.01$ and
minimum detectable signal-power-to-noise-power ratios
$P_0 = 0.01, 0.10$.
Fig. 1—OC function for sinusoidal carrier of arbitrary duty ratio in
Gaussian noise of signal-to-noise ratio, $\eta_1 < 1$.


VI. THE AVERAGE SAMPLE NUMBER (ASN) FUNCTION
FOR $\eta_1 < 1$


The general expression for the asn function is given by
(6). When specialized to the problem considered we have
from (27) and (6),

$$E_{\eta,d}(n) = \frac{L(\eta, d) \log \frac{\beta}{1-\alpha} + [1 - L(\eta, d)] \log \frac{1-\beta}{\alpha}}{\frac{d \, d_0 \eta^2 \eta_1^2}{4} - \frac{d_0^2 \eta_1^4}{8}}, \tag{43}$$

where $L(\eta, d)$ is the oc function given in (40). In terms of
the parameter $P$ previously defined we have,

$$E_P(n) = \frac{L(P) \log \frac{\beta}{1-\alpha} + [1 - L(P)] \log \frac{1-\beta}{\alpha}}{\frac{P P_0}{4} - \frac{P_0^2}{8}}. \tag{44}$$

At the indeterminate point $P' = P_0/2$ we obtain from (8)

$$E_{P'}(n) = \frac{4\left(\log \frac{1-\beta}{\alpha}\right)\left(\log \frac{1-\alpha}{\beta}\right)}{P_0^2}. \tag{45}$$

At the threshold parameters $P = 0$, $P = P_0$ it is seen that
the asn function is inversely proportional to the average
signal-power-to-noise-power ratio squared or the fourth
power of the signal-to-noise ratio. Fig. 2 is a curve of the
asn function for $\alpha = \beta = 0.01$ and two values of threshold
average power $P_0 = 0.01, 0.10$.


[Figure 2: average sample number plotted against $P$ for the two values of $P_0$.]

Fig. 2—ASN function for sinusoidal carrier of arbitrary duty ratio
in Gaussian noise of signal-to-noise ratio, $\eta_1 < 1$.
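Points on curves of this kind can be generated with a short Python sketch. Everything here rests on the small-signal approximations of Sections IV to VI: the function names are hypothetical, the relation $h = 1 - 2P/P_0$ is taken from (38), and the drift $P P_0/4 - P_0^2/8$ and second moment $P_0^2/4$ are assumptions carried over from the fourth order expansion.

```python
import math

def oc_of_power(p, p0, alpha, beta):
    """Approximate OC function in terms of the average power ratio P."""
    log_a = math.log((1.0 - beta) / alpha)
    log_b = math.log(beta / (1.0 - alpha))
    h = 1.0 - 2.0 * p / p0
    if abs(h) < 1e-9:                      # indeterminate point P = P0 / 2
        return log_a / (log_a - log_b)
    a_h, b_h = math.exp(h * log_a), math.exp(h * log_b)
    return (a_h - 1.0) / (a_h - b_h)

def asn_of_power(p, p0, alpha, beta):
    """Approximate ASN function in terms of the average power ratio P."""
    log_a = math.log((1.0 - beta) / alpha)
    log_b = math.log(beta / (1.0 - alpha))
    if abs(1.0 - 2.0 * p / p0) < 1e-9:     # zero drift: use E(z^2) = P0^2 / 4
        return -log_a * log_b / (p0 * p0 / 4.0)
    drift = p * p0 / 4.0 - p0 * p0 / 8.0   # small-signal E(z)
    l = oc_of_power(p, p0, alpha, beta)
    return (l * log_b + (1.0 - l) * log_a) / drift
```

Under these assumptions the asn at a fixed ratio $P/P_0$ scales as $1/P_0^2$, so reducing the threshold average power by a factor of ten raises the required number of samples a hundredfold.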


VII. TO SHOW THAT A SECOND ORDER APPROXIMATION TO
THE LOGARITHM OF THE PROBABILITY RATIO IS
REQUIRED FOR A SEQUENTIAL TEST TO
CONVERGE AT THE THRESHOLD
PARAMETER, $\theta = \theta_0$


Let

$$z = \log \frac{P(r, \theta_1)}{P(r, \theta_0)} \tag{46}$$

be a particular term in the logarithm of the likelihood
ratio corresponding to a test of hypothesis $\theta \le \theta_0$ against
$\theta \ge \theta_1$, ($\theta_1 > \theta_0$). The functions $P(r, \theta_1)$, $P(r, \theta_0)$ are
probability density functions belonging to a one parameter
family of density functions. Expanding $z$ in the
familiar Taylor series for the logarithm yields,

$$z \approx \left(\frac{P(r, \theta_1)}{P(r, \theta_0)} - 1\right) - \frac{1}{2}\left(\frac{P(r, \theta_1)}{P(r, \theta_0)} - 1\right)^2. \tag{47}$$
The expected value of $z$ on condition that $\theta_0$ is the true
parameter upon neglecting third order terms in $z$ is
simply,

$$E_{\theta_0}(z) = -\frac{1}{2} \int P(r, \theta_0)\left[\frac{P(r, \theta_1)}{P(r, \theta_0)} - 1\right]^2 dr. \tag{48}$$

The first term in the expression (47) contributes nothing
to $E_{\theta_0}(z)$, since,

$$\int P(r, \theta_0)\left[\frac{P(r, \theta_1)}{P(r, \theta_0)} - 1\right] dr = \int P(r, \theta_1) \, dr - \int P(r, \theta_0) \, dr = 0. \tag{49}$$

Since the asn function, (6), contains $E_{\theta_0}(z)$ in the denominator
it follows that at the point $\theta = \theta_0$, $E_{\theta_0}(n) = \infty$,
if only the first term,

$$z^{(1)} \approx \frac{P(r, \theta_1)}{P(r, \theta_0)} - 1, \tag{50}$$

is used in the approximation (47). At the parameter point
$\theta = \theta_1$ the expected value of (50) does not vanish.

Although the asn function is an approximation since
the excess over the boundaries is neglected we claim that
the proof is still valid. Neglecting the excess over the
boundaries at the termination of a sequential test introduces
an approximation in the numerator of the asn
function only and not in the denominator. The expression
$E_\eta(z)$ in the denominator is exact and it is shown here
that a first order approximation to $z$ yields $E_\eta(z) = 0$ at
$\eta = 0$. This is actually the cause of the divergence of the
sequential test at $\eta = 0$. Furthermore, when $\eta_1 \ll 1$, the
excess over the boundaries at the termination of the
experiment is negligible and hence the expression for the
asn function is almost exact.

As an example consider the detector law given in (16)

$$z = \log I_0(\eta_1 r) - \tfrac{1}{2}\eta_1^2. \tag{51}$$

A first order approximation to $\log I_0(\eta_1 r)$ is given by

$$\log I_0(\eta_1 r) \approx I_0(\eta_1 r) - 1. \tag{52}$$

Using the series for $I_0(\eta_1 r)$ given in (10), substituting
into (52), and considering only the first nonvanishing
term, gives,

$$\log I_0(\eta_1 r) \approx \frac{\eta_1^2 r^2}{4}. \tag{53}$$

Hence the detector for the case $\eta_1 < 1$ appears to be

$$z^{(1)} = \frac{\eta_1^2 r^2}{4} - \frac{\eta_1^2}{2}. \tag{54}$$

This is the so-called biased square-law detector. However,
if we take the expected value of (54) on condition that
only noise is present, we have,

$$E_0(z^{(1)}) = 0,$$

since from (23) $E_0(r^2) = 2$. However, at the threshold
parameter $\eta = \eta_1$, using (23), we have

$$E_{\eta_1}(z^{(1)}) = \frac{\eta_1^4}{4}. \tag{55}$$

We can therefore conclude that the square law detector
plus a bias term contributes the required information for
a terminal decision when signal is present, but it does not
contribute any information, on the average, when noise
alone is present. From (21) it is seen that when noise
alone is present the fourth order term contributes the
required information $E_0(z)$ for a terminal decision. Comparing
(55) to (28), it is seen that when signal is present
the average number of samples required for termination
for the required detector, (21), is twice as large when
compared to the detector given by (54). Thus, the fact
that termination is required when $\eta = 0$ as well as when
$\eta = \eta_1$ leads to an increase in the number of samples
when signal is present. Therefore, we gain convergence
at $\eta = 0$ at the expense of losing one half the information
at $\eta = \eta_1$. The amount of information lost is that required
for the sequential test to terminate at $\eta = 0$. [Here we
interpret information as the quantity $E_\eta(z)$.]


VIII. TO SHOW THAT FOR $\eta_1 \gg 1$ AND $d_0 \ll 1$ THE
OPTIMUM DETECTOR LAW IS A BERNOULLI DETECTOR


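Let us refer once again to the expression for the general
detector law for $n$ samples,

$$z(n) = \sum_{i=1}^{n} \log \left[1 + d_0\left(e^{-\frac{1}{2}\eta_1^2} I_0(\eta_1 r_i) - 1\right)\right]. \tag{56}$$

For $\eta_1 r > 2$ we can approximate [5] $I_0(\eta_1 r)$ by the
asymptotic expression,

$$I_0(\eta_1 r) \approx \frac{e^{\eta_1 r}}{\sqrt{2\pi \eta_1 r}}. \tag{57}$$

It can be shown that when $\eta_1 \ge 2$, the probability that $r$
exceeds unity, on condition that a signal with $\eta_1 \ge 2$ is present, is greater than 0.90 or
expressed mathematically,

$$p(r > 1) > 0.90; \qquad \eta_1 \ge 2.$$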

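Hence, the approximation is valid almost always. Putting
(57) into (56) gives for the detector law,

$$z(n) \approx \sum_{i=1}^{n} \log \left[1 + d_0\left(\frac{e^{(-\frac{1}{2}\eta_1^2 + \eta_1 r_i)}}{\sqrt{2\pi \eta_1 r_i}} - 1\right)\right]. \tag{58}$$

If we further approximate the logarithm by its first two
terms in the Taylor expansion, we have, after some
manipulation,

$$z(n) \approx \sum_{i=1}^{n} d_0\left(\frac{e^{(-\frac{1}{2}\eta_1^2 + \eta_1 r_i)}}{\sqrt{2\pi \eta_1 r_i}} - 1\right)\left[1 - \frac{d_0}{2}\left(\frac{e^{(-\frac{1}{2}\eta_1^2 + \eta_1 r_i)}}{\sqrt{2\pi \eta_1 r_i}} - 1\right)\right]. \tag{59}$$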

Furthermore, assume that the parameters $d_0$ and $\eta_1$ are
such that the inequality,

$$\frac{d_0}{2}\left(\frac{e^{(-\frac{1}{2}\eta_1^2 + \eta_1 r)}}{\sqrt{2\pi \eta_1 r}} - 1\right) \ll 1 \tag{60}$$
is almost always satisfied. In order to obtain an upper
bound on $d_0$ for which (60) holds, we consider the following
approach. The standard deviation of $r$ for values of $\eta_1 > 2$
is approximately,

$$\sigma_r = \sqrt{2 + \eta_1^2 - (\bar{r})^2} \approx \sqrt{2}. \tag{61}$$

This follows from the fact that the mean square value of
$r$ is $(2 + \eta_1^2)$ and the squared average value of $r$ for $\eta_1 > 2$
is approximately $\eta_1^2$. The probability that $r$ is less than
$\eta_1 + 3\sqrt{2}$ (where $3\sqrt{2}$ = three standard deviations)
is certainly greater than 0.90 when $\eta_1 > 2$ or expressed
mathematically,

$$p\{r < \eta_1 + 3\sqrt{2}\} > 0.90. \tag{62}$$

Hence, replacing $r$ in (60) by $\eta_1 + 3\sqrt{2}$ strengthens the
inequality of (60) almost always. Therefore,

$$\frac{d_0}{2}\left(\frac{e^{[\frac{1}{2}\eta_1^2 + 3\sqrt{2}\eta_1]}}{\sqrt{2\pi}\,\eta_1} - 1\right) \ll 1. \tag{63}$$

We replaced $r$ in the denominator by $\eta_1$, strengthening
the inequality even more. Neglecting $-1$ in the parentheses
and taking logarithms of (63) gives

$$\log \frac{d_0}{2} \ll \log \sqrt{2\pi}\,\eta_1 - \left(\tfrac{1}{2}\eta_1^2 + 3\sqrt{2}\eta_1\right),$$

or

$$d_0 \ll 2\sqrt{2\pi}\,\eta_1 \, e^{-(\frac{1}{2}\eta_1^2 + 3\sqrt{2}\eta_1)}. \tag{64}$$

For a given value $\eta_1 > 2$ if $d_0$ is chosen so that inequality
(64) is satisfied then inequality (60) is almost always
satisfied. Hence, we can further approximate (59) by
considering the first term only, or,

$$z(n) \approx d_0 e^{-\frac{1}{2}\eta_1^2} \sum_{i=1}^{n} \frac{e^{\eta_1 r_i}}{\sqrt{2\pi \eta_1 r_i}} - d_0 n. \tag{65}$$
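The detector law of (65) weights the peaks of the signal
very heavily.

Since $d_0 \ll 1$, it follows that on the average the number of
samples is $n \gg 1$, as many samples must be observed in
order to allow for the occurrence of at least one pulse
when signal is present. With these assumptions we
approximate the sum in (65) by an integral or,

$$\sum_{i=1}^{n} \frac{e^{\eta_1 r_i}}{\sqrt{2\pi \eta_1 r_i}} \approx n \int_0^\infty \frac{e^{\eta_1 r}}{\sqrt{2\pi \eta_1 r}} \, r e^{-\frac{1}{2} r^2} \, dr = n e^{\frac{1}{2}\eta_1^2} \int_0^\infty \frac{r \, e^{-\frac{1}{2}(r - \eta_1)^2}}{\sqrt{2\pi \eta_1 r}} \, dr. \tag{66}$$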
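Again since most of the samples will be noise [i.e., $(1 - d_0)n$
for large $n$] we average on condition that noise is present.
Before integrating (66) we note that the integrand contributes
to the integral primarily in the neighborhood of
$\eta_1$. Hence, the term $1/\sqrt{r}$ has negligible effect on the
integral everywhere except at $r = \eta_1$. We further choose
$\eta_1$ sufficiently large ($\eta_1 > 3$) and replace the range of
integration to cover the entire real line. After all these
assumptions, we finally obtain,

$$n e^{\frac{1}{2}\eta_1^2} \int_0^\infty \frac{r \, e^{-\frac{1}{2}(r - \eta_1)^2}}{\sqrt{2\pi \eta_1 r}} \, dr \approx \frac{n e^{\frac{1}{2}\eta_1^2}}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{1}{2}(r - \eta_1)^2} \, dr = n e^{\frac{1}{2}\eta_1^2}, \tag{67}$$

so that

$$\sum_{i=1}^{n} \frac{e^{\eta_1 r_i}}{\sqrt{2\pi \eta_1 r_i}} \approx n e^{\frac{1}{2}\eta_1^2}. \tag{68}$$

Eq. (68) is, of course, only an approximation. The probability
that a noise sample will exceed the threshold
$\eta_1 - 3\sqrt{2}$ (i.e., three standard deviations to the left
of $\eta_1$) is

$$p_0(r) = e^{-\frac{1}{2}[\eta_1 - 3\sqrt{2}]^2} \approx e^{-\frac{1}{2}\eta_1^2} \tag{69}$$

for $\eta_1 \gg 3\sqrt{2}$.
The number of samples exceeding the threshold is approximately,

$$k_n \approx n e^{-\frac{1}{2}\eta_1^2}. \tag{70}$$

Substituting (70) into (68) for $n$ gives,

$$\sum_{i=1}^{n} \frac{e^{\eta_1 r_i}}{\sqrt{2\pi \eta_1 r_i}} \approx k_n e^{\eta_1^2}. \tag{71}$$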


Therefore, for $\eta_1 \gg 1$ and very small $d_0$, the sum in (71)
contributes approximately the same information as a
Bernoulli detector set at a slicing threshold $r_0 = \eta_1 - 3\sqrt{2}$
which weights a success by $e^{\eta_1^2}$. If we are given a sequence
of independent random variables $r_1, \ldots, r_n$, then a Bernoulli
sequence can be derived from the given sequence by the
following slicing operation:

$$F(r_i; r_0) = X_i,$$

where,

$$X_i = \begin{cases} 1 & \text{when } r_i \ge r_0, \\ 0 & \text{when } r_i < r_0. \end{cases}$$

The sequence $\{X_i\}$, $i = 1, 2, \ldots, n$ consists of zeros and
ones distributed in a Bernoulli distribution. Substituting
(71) into (65) gives,

$$z(n) = d_0 k_n e^{\frac{1}{2}\eta_1^2} - d_0 n. \tag{72}$$

The decision regions for detecting $H_0$ and $H_1$ are therefore,

$$k_n \le \frac{1}{d_0 e^{\frac{1}{2}\eta_1^2}}\left[\log \frac{\beta}{1-\alpha} + d_0 n\right] \rightarrow H_0, \tag{73}$$

and,

$$k_n \ge \frac{1}{d_0 e^{\frac{1}{2}\eta_1^2}}\left[\log \frac{1-\beta}{\alpha} + d_0 n\right] \rightarrow H_1, \tag{74}$$

which holds when the conditions,

$$d_0 \ll e^{-\frac{1}{2}\eta_1^2}; \qquad \eta_1 \gg 1; \tag{75}$$
are satisfied. Thus when we have a large number of cells
or channels which have very low probability of receiving
signal ($d_0 \ll 1$) and where the probability of a noise
sample exceeding the peak value of the signal is much
greater than the probability of occurrence of signal in a
channel ($e^{-\frac{1}{2}\eta_1^2} \gg d_0$) and where the peak signal-to-noise
ratio is very large, a threshold detector law is approximately
the best. We see that power integration or
equivalent methods are not as good as a detector which
weights the peaks heavily.
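The counting detector described by (72) to (74) can be sketched as follows. This Python fragment is illustrative only: the function name is hypothetical, the slicing threshold $\eta_1 - 3\sqrt{2}$ and the weight $d_0 e^{\eta_1^2/2}$ per threshold crossing are taken from the approximations of this section, and the sketch is meaningful only when $d_0 \ll e^{-\eta_1^2/2}$ and $\eta_1 \gg 1$.

```python
import math

def bernoulli_detector(samples, d0, eta1, alpha, beta):
    """Sequential counting detector based on sliced envelope samples.

    Each sample at or above the slicing threshold adds d0 * exp(eta1^2 / 2)
    to the statistic, and every sample subtracts the bias d0.
    Returns (decision, n) with decision "H0", "H1", or None.
    """
    r0 = eta1 - 3.0 * math.sqrt(2.0)               # slicing threshold
    weight = d0 * math.exp(0.5 * eta1 ** 2)        # gain per threshold crossing
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    z = 0.0
    n = 0
    for n, r in enumerate(samples, start=1):
        z += (weight if r >= r0 else 0.0) - d0
        if z <= lower:
            return "H0", n
        if z >= upper:
            return "H1", n
    return None, n
```

Only a comparison and a counter are needed per sample, which is the practical attraction of this regime compared with evaluating the logarithm and Bessel function of the general law.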

Stated in another way, the problem considered here is
equivalent to considering two Bernoulli random variables
with threshold probabilities,

$$p_0 = e^{-\frac{1}{2}\eta_1^2}, \tag{76}$$

$$p_1 = d_0 + e^{-\frac{1}{2}\eta_1^2}. \tag{77}$$

We will now prove this. This also serves as a verification
of the results which have just been obtained.

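For a Bernoulli test the observable which is measured
is given by [2], [3], and [8],

$$z(n) = k_n \log \frac{p_1}{p_0} + (n - k_n) \log \frac{1 - p_1}{1 - p_0}. \tag{78}$$

From (76) and (77) and using the Taylor series expansion
and approximation to the logarithm we obtain,

$$\log \frac{p_1}{p_0} = \log \left[1 + d_0 e^{\frac{1}{2}\eta_1^2}\right] \approx d_0 e^{\frac{1}{2}\eta_1^2}, \tag{79}$$

since we have assumed $d_0 \ll e^{-\frac{1}{2}\eta_1^2}$. Similarly,

$$\log \frac{1 - p_1}{1 - p_0} = \log \left(1 - \frac{d_0}{1 - p_0}\right) \approx -d_0, \qquad d_0 \ll e^{-\frac{1}{2}\eta_1^2} \ll 1. \tag{80}$$

From (78) and (80) we obtain,

$$z(n) = d_0 k_n e^{\frac{1}{2}\eta_1^2} - d_0 (n - k_n) \approx d_0 k_n e^{\frac{1}{2}\eta_1^2} - d_0 n, \tag{81}$$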

as $e^{\frac{1}{2}\eta_1^2} \gg 1$. Thus (81) derived for the Bernoulli case is
exactly the same as (72) which was previously developed
under the same set of approximations. We can now state
that under the conditions previously stated the best
detector given by sequential theory for detecting a sine-wave
carrier in noise of large peak-signal-to-noise ratio and
very small duty ratio is equivalent to the detector law
corresponding to the detection of a random sequence of
ones of probability $d_0$ interleaved (with no overlap) with a
random sequence of ones of probability $e^{-\frac{1}{2}\eta_1^2}$ such that
$d_0 e^{\frac{1}{2}\eta_1^2} \ll 1$. The random sequence of ones of probability
$e^{-\frac{1}{2}\eta_1^2}$ is considered as the noise sequence while the
interleaved sequence of probability $d_0$ is the signal
sequence. Since $d_0 \ll e^{-\frac{1}{2}\eta_1^2}$ there are many more noise
pulses than the desired signal pulses. Thus we actually
have another example of a signal embedded in impulse
noise with an equivalent signal-to-noise ratio which can be
defined by the ratio of the signal and noise probabilities.
The equivalent signal-to-noise ratio is $d_0 e^{\frac{1}{2}\eta_1^2} \ll 1$. The
inequality $d_0 e^{\frac{1}{2}\eta_1^2} \ll 1$ can be written as $\frac{1}{2}\eta_1^2 \ll -\log d_0$.
Hence, when the information gained by observing a pulse
is much greater than the peak-signal-power-to-noise-power
ratio of the pulse a threshold detector is a good
approximation to the best detector. The measurement
process then involves only counting. This holds when
$\eta_1 \gg 1$. The slicing threshold can be set at approximately
$r_0 = \eta_1 - 3\sqrt{2}$. This implies that when a pulse is present
it will exceed $r_0$ almost certainly. That is,

$$p\{r > \eta_1 - 3\sqrt{2}\} > 0.90, \qquad \eta_1 \gg 1.$$
CORRECTIONS


Robert Price, author of ““Optimum Detection of Random 
Signals in Noise, with Application to Scatter-Multipath 
Communication, I,’ which appeared on pages 125-135 
of the December, 1956, issue of these TRANSACTIONS, has 
requested the following corrections. 

In footnote 9, section 1, the fifth line should read “if the 
detector output consists of ---” 

In Fig. 1 on page 127, change M to N in the scatter 
paths. 

On page 129, in the first paragraph following (26), the 
second sentence should begin, ‘“The simultaneous integral 
equation (22) ---” 

In footnote 23 on page 130, the book by Courant and 
Hilbert should be identified as Volume I. 

On page 131 the term in parentheses on the last line of 
(54) should be ¢,(¢ — 2). 

In (60) on page 132, change the italic capital I to 
script capital 9. The same correction should be made in 
the tabulation in the same column on the third line under 
the heading ‘‘This Paper.” 

In (66), replace 8 by g in the denominator. 

Insert ‘See (4.35) and (4.36),” at the end of the first 
paragraph of footnote 26. In the same footnote, the first 
term of the first two equations should be F;’’. 

In (69) change = on the first line to =, and change r” 
on the second line to r. 

On page 133, the heading on the fourth line of the 
second column should be identified as Case II. 

On page 134, two lines below (75), insert ‘‘n is given by 
(65). As w — 0,” ete., and on the following line delete 
“(next page)’’ at the end of the line. 




Immediately below Fig. 5, the last word on the first 
line should be ‘‘least.’’ 
In (76) change the italic capital J to script capital J. 
David Middleton, author of "On the Detection of
Stochastic Signals in Additive Normal Noise—Part I,"
which appeared on pages 86-121 of the June, 1957, issue
of these TRANSACTIONS, has requested the following
corrections.
In (24) on page 90, ky’ should immediately follow ¥. 
In (74) on page 95, replace y by « in the upper limit. 
On the same page, seven lines below Fig. 1, replace 
‘present’ by ‘‘preset.’’ 


In footnote 18 on page 96, the third line below the 


equation should begin: by K,(| t — 7 |). 

In (98), replace Bj in the integrand by By. 

On page 100, the term log (K/T,) which appears in 
(101la), (101b), (103a), and (103b) should be log K — I%. 

In (121) on page 102, the term Ky, at the end of the 
first line should be Ky). 


In the first line of (128a) on page 103, the term above _ 
>> should be n + ; the last symbol in the exponent should — 


be Nin: 

On page 106, three lines below (148), 7'/n should be 
enclosed in parentheses. 

In (168a) on page 109, the second Z should be 2. 

The last line of (187) on page 111 should begin with 
a + sign. On the same page, (—1)* should be inserted 
before B-'’?-”? on the second line of (188). 


In (222) on page 115, the term before the equal sign 
nould be exp {— p, | ¢ |}. 

On the same page, replace A by A’ in the first equation
of (225). Here A’ is determined by regarding the second
equation of (225) as an identity. It is easily seen that
A’ = 2A. Replace A by A’ in the following, unless
otherwise indicated.

Also on page 115, on the second line of (229), the symbol
following V7 should be (t − t¢’).

In (230), (231), (233), and (236) on page 116, replace
A by A’ = 2A.

On page 116, on the second and third lines of (231a),
the last term should be e °"’. In (231b) the last term on
the second and third lines should be e*“'~””.

In (239) delete the 2 in the second relation. 

In (241) and (242) on page 117, insert a minus sign
before F,,. Also insert a minus sign after dx in the first
line of (243).

In (244), replace the 4’s by 2’s and the 2 by 1. 

In (245a) replace 4 by 2 and place the radical in the 


Correspondence


The Sample Space Trajectory of 
Time-Shifted Signal Vectors* 


While investigating optimum phase-determining
filters,¹ it was convenient to give
some geometric significance to the trajectory
traced by the tip of the vector representing
a band-limited, time-shifted signal
in sample space.

One constraint on the tip trajectory of
such a signal vector has been stated by
Shannon.² This requires the signal to remain
on the surface of a hyper-sphere of
radius


r = \sqrt{2WE}    (1)

where

W = highest frequency in the signal

E = total energy in the signal.

It may be demonstrated that another
constraint on the trajectory of a time-shifted
signal is that it must lie on a hyper-plane
having identical direction cosines (or
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denominator for h,. In the last two relations, insert factor 
1/2 and put the radical in the denominator. 

In (246) delete the first factor 2. Place the first radical 
in the denominator, and in the second line of the equation 
1 + ¥ should be V1 + ¥3. 

In (247a) and (247b), multiply by 1/2. 

In (248) delete the factor 2, and place the first radical 
in the denominator, outside the parentheses. The second 
term should be preceded by a minus sign. 

On page 118, in (254) delete the factors 2 and place an 
equal sign between h, and h* on the second line of the 
equation. 

In (256) replace the factor 8 by 4, insert minus signs 
before c, in the last two exponents, and insert a minus 
sign before the second term. 

On the first line of (268), delete the minus sign before 
the last exponent c,(t — 7). 

In (269a) and (270) the first (6, + c¢,)’ should be 
(b, — ¢,)°. Insert a minus sign before the second term in 
(270). 


Fig. 1. The intersection of the energy sphere and the hyper-plane gives
the tip trajectory of the time-shifted signal vector. Axes: x(1/2W),
x(2/2W), x(3/2W). Labeled surfaces: the hyper-plane containing the tip of
the time-shifted signal vector, and the constant-energy sphere.


having equal intercepts on all coordinate 
axes). The perpendicular distance between 
the hyper-plane containing the time-shifted 
signal and the origin is proportional to the 
integrated area under the signal. 




* Received by the PGIT, May 23, 1957. 

1 This work appears as a portion of ch. 5 of Dissertation
entitled, ‘‘The Detection and Phase Determination
of Signals in Additive and Multiplicative
Noise,’’ submitted June, 1955, to Polytechnic Institute
of Brooklyn, Brooklyn, N. Y., in partial fulfillment
of the requirements for the author’s D.E.E.
degree.

2 C. E. Shannon, ‘‘Communication in the presence
of noise,’’ Proc. IRE, vol. 37, pp. 10-21; January,
1949.


Note that² for a band-limited signal x(t)
translated by τ


x(t - \tau) = \sum_{n=-\infty}^{\infty} x\left(\frac{n}{2W} - \tau\right) \frac{\sin \pi(2Wt - n)}{\pi(2Wt - n)}


Integrating and using the known value of 
the sine-integral gives 


\sum_{n=-\infty}^{\infty} x\left(\frac{n}{2W} - \tau\right) = 2W \int_{-\infty}^{\infty} x(t)\,dt    (2)


Eq. (2) represents the equation of the 
hyper-plane to which the tip of the time- 
shifted signal vector is constrained. This 
hyper-plane has equal direction cosines (or 
equal intercept values with all coordinate 
axes). The distance, along a normal, from 
the origin to the hyper-plane, which is set 
by the expression on the right-hand side 
of (2), is proportional to the integrated 
area under the signal.
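
The two constraints can be checked numerically. The sketch below is not taken from the original note: it assumes a periodic signal with three samples per period, x(t) = a0 + a1 cos(2πt/T + φ), chosen to resemble the three-dimensional case of Fig. 1. The function name `shifted_samples` and all parameter values are hypothetical. For every shift τ the sample vector should keep the same coordinate sum (the hyper-plane) and the same length (the constant-energy sphere).

```python
import numpy as np

def shifted_samples(tau, a0=0.5, a1=1.0, phi=0.3, period=1.0, n_samples=3):
    """Samples of x(t - tau), with x(t) = a0 + a1*cos(2*pi*t/period + phi).

    The signal model and default values are illustrative assumptions.
    """
    t = np.arange(n_samples) * period / n_samples
    return a0 + a1 * np.cos(2 * np.pi * (t - tau) / period + phi)

# Sample vectors for a sweep of time shifts over one period.
taus = np.linspace(0.0, 1.0, 50, endpoint=False)
vectors = np.array([shifted_samples(tau) for tau in taus])

# Hyper-plane constraint: the sum of the coordinates does not depend on tau.
coordinate_sums = vectors.sum(axis=1)

# Sphere constraint: the vector length does not depend on tau.
norms = np.linalg.norm(vectors, axis=1)

# Distance from the origin to the plane, measured along the unit normal
# (1, 1, 1)/sqrt(3), and the radius of the resulting circle of intersection.
unit_normal = np.ones(3) / np.sqrt(3)
plane_distance = vectors @ unit_normal
circle_radius = np.sqrt(norms**2 - plane_distance**2)

print("coordinate sum spread:", np.ptp(coordinate_sums))
print("norm spread:", np.ptp(norms))
print("circle radius:", circle_radius[0])
```

With these assumed values the coordinate sum is 3·a0 and the squared length is 3·a0² + 1.5·a1², so only the direct-current term a0 moves the plane away from the origin, while a1 alone sets the radius of the circle.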

One can view the intersection either as a 
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trajectory along the surface of the constant 
energy hyper-sphere or as a path described 
by the signal point in hyper-space as the 
axes are rotated. 

The intersection of the hyper-plane and
hyper-sphere is a hyper-sphere reduced in
dimensionality³ by 1. (See the three-dimensional
periodic signal example in Fig.
1, p. 257.) This has delimited the trajectory
only very slightly, but it is useful in confirm- 
ing our beliefs concerning those elements of 
a signal which do not carry phase informa- 
tion. As the integrated area under the signal 
becomes larger, the hyper-plane is just 


3 In the limit of infinite dimensionality, the result- 
ing infinite energy must be normalized. 
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tangent to the hyper-sphere at one point. 
This point corresponds to the signal whose 
time-shift causes all samples to rotate into 
themselves (e.g., direct current). 

In order to make the points on the tra- 
jectory as disparate as possible, the inte- 
grated area under the signal should be zero. 
This should give the largest possible radius 
to the hyper-sphere of reduced dimension- 
ality created by the intersecting surfaces. 
This confirms our intuitive notions that 
direct current (or integrated value of the 
signal) carries no phase information and 
should, therefore, be avoided in generating 
limited energy signals whose phase or time- 
shift is to be measured. 
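
The dependence of the reduced radius on the integrated area can be written out for a finite number N of sample coordinates. This is a sketch under that assumption (footnote 3 covers the infinite case), and the symbols N, c, and ρ are introduced here for illustration only. Let c denote the right-hand side of (2), so that the hyper-plane is the set of points whose coordinates sum to c.

```latex
% Distance from the origin to the hyper-plane \sum_n x_n = c in N dimensions:
d = \frac{|c|}{\sqrt{N}}

% Radius of the intersection of that plane with the sphere of radius r = \sqrt{2WE}:
\rho = \sqrt{r^{2} - d^{2}} = \sqrt{2WE - \frac{c^{2}}{N}}

% \rho is largest (\rho = r) when c = 0, and \rho = 0 when c^{2} = 2WEN (tangency).
```

Under these assumptions, any nonzero integrated area uses up part of the fixed energy 2WE in the direction normal to the plane and leaves less for the trajectory itself.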
In a limited sample space of three dimensions,
it is interesting to note that all
time-shifted signals of three dimensions are
identical except in magnitude and direct
current content.

The remaining parameters needed to
describe the tip trajectory are the Fourier
coefficients which define additional hyper-planes
in sample space and further constrain
the intersection with the constant-energy
hyper-sphere.
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